Pinned

  1. vllm Public

    A high-throughput and memory-efficient inference and serving engine for LLMs

    Python, 60.1k stars, 10.5k forks

  2. llm-compressor Public

    Transformers-compatible library for applying various compression algorithms to LLMs for optimized deployment with vLLM

    Python, 2.1k stars, 254 forks

  3. recipes Public

    Common recipes to run vLLM

    Jupyter Notebook, 161 stars, 56 forks

Repositories

Showing 10 of 24 repositories
  • vllm-gaudi Public

    Community maintained hardware plugin for vLLM on Intel Gaudi

    Python, 12 stars, Apache-2.0, 51 forks, 1 issue, 58 pull requests, updated Oct 14, 2025
  • vllm-ascend Public

    Community maintained hardware plugin for vLLM on Ascend

    Python, 1,202 stars, Apache-2.0, 483 forks, 561 issues (7 need help), 179 pull requests, updated Oct 14, 2025
  • vllm Public

    A high-throughput and memory-efficient inference and serving engine for LLMs

    Python, 60,050 stars, Apache-2.0, 10,533 forks, 1,815 issues (31 need help), 1,163 pull requests, updated Oct 14, 2025
  • semantic-router Public

    Intelligent Mixture-of-Models Router for Efficient LLM Inference

    Go, 1,769 stars, Apache-2.0, 209 forks, 84 issues (15 need help), 17 pull requests, updated Oct 14, 2025
  • llm-compressor Public

    Transformers-compatible library for applying various compression algorithms to LLMs for optimized deployment with vLLM

    Python, 2,082 stars, Apache-2.0, 254 forks, 58 issues (11 need help), 40 pull requests, updated Oct 14, 2025
  • guidellm Public

    Evaluate and Enhance Your LLM Deployments for Real-World Inference Needs

    Python, 621 stars, Apache-2.0, 88 forks, 85 issues (5 need help), 27 pull requests, updated Oct 14, 2025
  • vllm-spyre Public

    Community maintained hardware plugin for vLLM on Spyre

    Python, 35 stars, Apache-2.0, 26 forks, 4 issues, 16 pull requests, updated Oct 13, 2025
  • ci-infra Public

    This repo hosts code for vLLM CI & Performance Benchmark infrastructure.

    HCL, 22 stars, 41 forks, 0 issues, 17 pull requests, updated Oct 13, 2025
  • production-stack Public

    vLLM’s reference system for K8S-native cluster-wide deployment with community-driven performance optimization

    Python, 1,846 stars, Apache-2.0, 306 forks, 84 issues (3 need help), 55 pull requests, updated Oct 13, 2025
  • aibrix Public

    Cost-efficient and pluggable Infrastructure components for GenAI inference

    Go, 4,296 stars, Apache-2.0, 466 forks, 233 issues (22 need help), 23 pull requests, updated Oct 13, 2025