A flexible toolkit for customizable transformer model quantization
QuantKit is a modular and extensible framework for building and exporting quantized transformer models with fine-grained control. Whether you're targeting edge deployment, reducing inference latency, or experimenting with quantization strategies, QuantKit gives you all the knobs.
Designed with researchers and engineers in mind, QuantKit supports layer-wise quantization, asymmetric/symmetric schemes, 4/8-bit precision, and LoRA integration for efficient fine-tuning.
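For background on the two schemes: symmetric quantization maps values with a zero-centered scale, while asymmetric quantization adds a zero point so a skewed value range can use the full integer grid. A minimal illustration in plain Python — this is a sketch of the underlying math, not QuantKit's actual API; all function names here are hypothetical:

```python
# Illustrative only: the math behind symmetric vs. asymmetric 8-bit
# quantization. These helpers are hypothetical, not QuantKit's API.

def quantize_symmetric(values, bits=8):
    """Map floats to signed integers using a zero-centered scale."""
    qmax = 2 ** (bits - 1) - 1              # 127 for 8-bit
    scale = max(abs(v) for v in values) / qmax
    return [round(v / scale) for v in values], scale

def quantize_asymmetric(values, bits=8):
    """Map floats to unsigned integers using a scale and a zero point."""
    qmax = 2 ** bits - 1                    # 255 for 8-bit
    lo, hi = min(values), max(values)
    scale = (hi - lo) / qmax
    zero_point = round(-lo / scale)
    return [round(v / scale) + zero_point for v in values], scale, zero_point

def dequantize(q, scale, zero_point=0):
    """Recover approximate float values from quantized integers."""
    return [(qi - zero_point) * scale for qi in q]

weights = [-0.4, -0.1, 0.0, 0.25, 0.8]
q_sym, s = quantize_symmetric(weights)
q_asym, s2, zp = quantize_asymmetric(weights)
```

The asymmetric variant pays for the extra zero-point bookkeeping with better coverage of ranges that are not centered on zero, which is why activation tensors are often quantized asymmetrically while weights use the symmetric scheme.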
- Layer-wise, selective, or full-model quantization
- 4-bit, 8-bit, and mixed-precision support
- Symmetric and asymmetric quantization schemes
- Compatible with Hugging Face `transformers` and `bitsandbytes`
- Easy saving and loading of quantized models
- Works with LoRA/PEFT for efficient fine-tuning
- Python API and CLI for full control
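The idea behind layer-wise and mixed-precision quantization from the list above can be sketched in plain Python — assigning each layer its own bit width. This is a hedged illustration with made-up names (`bit_config`, the layer names, and `quantize_symmetric` are all hypothetical, not QuantKit's API):

```python
# Illustrative sketch of layer-wise mixed precision. All names here
# (bit_config, layer names, quantize_symmetric) are hypothetical.

def quantize_symmetric(values, bits):
    """Symmetric quantization: zero-centered scale, signed integer range."""
    qmax = 2 ** (bits - 1) - 1          # 127 for 8-bit, 7 for 4-bit
    scale = max(abs(v) for v in values) / qmax
    return [round(v / scale) for v in values], scale

# Toy "model": one flat weight list per layer, each with its own precision.
layers = {
    "attention": [-0.5, 0.25, 1.0],
    "mlp": [0.1, -0.9, 0.3],
}
bit_config = {"attention": 8, "mlp": 4}  # mixed precision per layer

quantized = {
    name: quantize_symmetric(weights, bit_config[name])
    for name, weights in layers.items()
}
```

Keeping sensitive layers (often attention projections) at higher precision while pushing the rest to 4-bit is a common way to trade a small accuracy cost for a large memory saving.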
```bash
git clone https://github.com/your-username/quantkit.git
cd quantkit
pip install -e .
```