Stars
Production-ready LLM model compression/quantization toolkit with accelerated inference support for both CPU/GPU via HF, vLLM, and SGLang.
Make your functions return something meaningful, typed, and safe!
Get up and running with Llama 3.3, DeepSeek-R1, Phi-4, Gemma 2, and other large language models.
Build and share delightful machine learning apps, all in Python. ⭐ Star to support our work!
Rust tool that supports PVC snapshots across Kubernetes namespaces
A next generation HTTP client for Python. 🦋
An extremely fast Python package and project manager, written in Rust.
KServe community docs for contributions and process
A high-throughput and memory-efficient inference and serving engine for LLMs
GitHub mirror - our actual code is hosted with Gerrit (please see https://www.mediawiki.org/wiki/Developer_access for contributing)
The easiest repo for building GPT applications.
[Not Actively Maintained] Whitebox is an open source E2E ML monitoring platform with edge capabilities that plays nicely with Kubernetes
Source code for Twitter's Recommendation Algorithm
🦜🔗 Build context-aware reasoning applications
Transform your Pythonic research into an artifact that engineers can deploy easily.
🧙 Build, run, and manage data pipelines for integrating and transforming data.
Examples and guides for using the OpenAI API
A minimal PyTorch re-implementation of the OpenAI GPT (Generative Pretrained Transformer) training
GitHub mirror of "machinelearning/liftwing/inference-services" - our actual code is hosted with Gerrit (please see https://www.mediawiki.org/wiki/Developer_access for contributing)
concurrent, cache-efficient, and Dockerfile-agnostic builder toolkit
Standardized Serverless ML Inference Platform on Kubernetes
A curated list of awesome actions to use on GitHub
The little ASGI framework that shines. ✨
Hydra is a framework for elegantly configuring complex applications
📚 Papers & tech blogs by companies sharing their work on data science & machine learning in production.