
vicinity


Approximate nearest-neighbor search. Each algorithm is a separate feature flag. Depend on only what you use. Rust crate vicinity; Python bindings ship as pyvicinity on PyPI.

Default distance is cosine; most graph indices L2-normalize on insert and operate on the cosine-equivalent unit sphere (so angular and inner product over normalized vectors fall out for free). True L2 distance is natively wired only in ivf_avq (which targets MIPS). The DistanceMetric enum in distance.rs is consumed by evaluation utilities and brute-force comparison; per-index metric selection is algorithm-specific.
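
A minimal sketch of that equivalence (plain Rust, not crate API): on L2-normalized vectors, cosine distance is 1 - a·b and squared Euclidean distance is 2(1 - a·b), so cosine, angular, and inner-product rankings all agree on the unit sphere.

// Sketch only: why normalized vectors make cosine, angular, and inner-product
// rankings coincide. Not part of the vicinity API.
fn l2_normalize(v: &mut [f32]) {
    let norm = v.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm > 0.0 {
        v.iter_mut().for_each(|x| *x /= norm);
    }
}

fn dot(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

fn main() {
    let (mut a, mut b) = (vec![1.0, 2.0, 3.0], vec![3.0, 2.0, 1.0]);
    l2_normalize(&mut a);
    l2_normalize(&mut b);
    let cosine_dist = 1.0 - dot(&a, &b);                // what the graph indices rank by
    let l2_sq: f32 = a.iter().zip(&b).map(|(x, y)| (x - y).powi(2)).sum();
    assert!((l2_sq - 2.0 * cosine_dist).abs() < 1e-6);  // ||a - b||^2 = 2(1 - a·b) on the unit sphere
}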

Which index?

Which index?
├── General purpose: HNSW (default)
├── Memory constrained: HNSW + RaBitQ (SymphonyQG) or IVF-PQ
├── Disk-backed / large scale: DiskANN (experimental; file-based, mmap planned)
├── Streaming insert/delete: FreshGraph (per-op latency) or LsmIndex (write throughput)
├── Filtered search: ACORN (metadata filters) or Curator (label filters)
├── Batch/static: IVF-PQ or IVF-AVQ
└── Sparse vectors: SparseMIPS

For high-dimensional data (d ≥ 256), prefer SQ4U or SymphonyQG over plain HNSW. Quantized graph traversal reduces distance computation cost. At low dimensions (d ≤ 25), plain HNSW wins; quantization overhead outweighs savings.

Install

Requires Rust 1.89+. Each algorithm is a separate feature; enable what you need:

[dependencies]
vicinity = { version = "0.8", features = ["hnsw"] }          # graph index
# vicinity = { version = "0.8", features = ["ivf_pq"] }      # compressed index
# vicinity = { version = "0.8", features = ["nsw"] }         # flat graph

Usage

HNSW

High recall, in-memory. Default index.

use vicinity::hnsw::HNSWIndex;

let mut index = HNSWIndex::builder(128)
    .m(16)
    .ef_search(50)
    .auto_normalize(true) // cosine (default) requires unit-norm vectors
    .build()?;
index.add_slice(0, &[0.1; 128])?;
index.add_slice(1, &[0.2; 128])?;
index.build()?;

let results = index.search(&[0.1; 128], 5, 50)?;
// results: Vec<(doc_id, distance)>; distance in [0, 2], lower is closer.
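
The third search argument is the query-time beam width, so recall and latency can be traded off per query. A sketch, assuming the (query, k, ef_search) signature used above; larger ef explores more of the graph:

// Sketch: sweep ef at query time to trade recall against latency.
for ef in [16, 50, 200] {
    let results = index.search(&[0.1; 128], 5, ef)?;
    println!("ef={ef}: {} neighbors, closest at distance {:.4}", results.len(), results[0].1);
}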

IVF-PQ

Compressed index. 32–64× less memory than HNSW, lower recall. Use for datasets that don't fit in RAM.

use vicinity::ivf_pq::{IVFPQIndex, IVFPQParams};

// IVF-PQ trains a quantizer on `build()`; aim for ≳ codebook_size × num_codebooks
// training vectors (defaults: codebook_size=256, num_codebooks=8 → ≳ 2048).
let params = IVFPQParams { num_clusters: 1024, num_codebooks: 8, nprobe: 16, ..Default::default() };
let mut index = IVFPQIndex::new(128, params)?;
for (id, vec) in dataset.iter().enumerate() {
    index.add_slice(id as u32, vec)?;
}
index.build()?;

let results = index.search(&query, 5)?;

See examples/ivf_pq_demo.rs for a runnable end-to-end example.
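
The 32-64× figure above is roughly the ratio of raw vector bytes to PQ code bytes, assuming the standard layout of one byte per codebook code (codebook_size = 256) and ignoring centroid tables and cluster ids. A back-of-the-envelope sketch:

// Back-of-the-envelope PQ compression ratio, assuming one u8 code per codebook.
fn pq_compression(dim: usize, num_codebooks: usize) -> f64 {
    let raw_bytes = dim * std::mem::size_of::<f32>(); // full-precision vector
    let code_bytes = num_codebooks;                   // one byte per sub-vector code
    raw_bytes as f64 / code_bytes as f64
}

fn main() {
    println!("{}", pq_compression(128, 8));  // 64x
    println!("{}", pq_compression(128, 16)); // 32x
}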

Python (pyvicinity)

HNSW is available from Python via pyvicinity (abi3 wheel, CPython 3.9+). Useful for semantic search, recommendation, deduplication, and any other task where you have N embedding vectors and need to find the k closest to a query vector.

pip install pyvicinity

import numpy as np
from pyvicinity import HNSWIndex, DistanceMetric

# embeddings: any (n, 384) float32 array (sentence-transformers, openai, etc.)
embeddings = np.random.default_rng(0).standard_normal((10_000, 384), dtype=np.float32)

index = HNSWIndex(
    dim=384,
    metric=DistanceMetric.Cosine,
    auto_normalize=True,  # normalizes inserts and queries
    seed=42,
)
index.add_items(embeddings)
index.build()

# single query: top-10 neighbors. Distances in [0, 2]; lower means more similar.
ids, dists = index.search(embeddings[0], k=10)

# batch: same shape, one row per query.
queries = embeddings[:32]
batch_ids, batch_dists = index.batch_search(queries, k=10)  # (32, 10) int64

Runnable examples (in the source repo, under examples/python/, not shipped with the wheel):

  • 01_text_similarity.py: semantic search over text with sentence-transformers
  • 02_batch_and_recall.py: recall@10 vs ef_search sweep
  • 03_ann_benchmarks_harness.py: drop-in ann-benchmarks / VIBE wrapper

The bindings ship hand-written .pyi stubs (py.typed) and are verified in CI by mypy.stubtest.

Persistence

Save and load indexes with the serde feature:

[dependencies]
vicinity = { version = "0.8", features = ["hnsw", "serde"] }

// Save
index.save_to_file("my_index.json")?;

// Load
let index = HNSWIndex::load_from_file("my_index.json")?;

See examples/06_save_and_load.rs for a full example.
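
For a quick sanity check, the calls above compose into a round trip. A sketch using only the API shown; the reloaded index should return the same neighbors since load restores the same graph:

// Sketch: save, reload, and confirm the reloaded index returns the same neighbors.
index.save_to_file("my_index.json")?;
let reloaded = HNSWIndex::load_from_file("my_index.json")?;

let before = index.search(&[0.1; 128], 5, 50)?;
let after = reloaded.search(&[0.1; 128], 5, 50)?;
assert_eq!(
    before.iter().map(|(id, _)| *id).collect::<Vec<_>>(),
    after.iter().map(|(id, _)| *id).collect::<Vec<_>>(),
);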

Benchmark

GloVe-25 (1.18M vectors, 25-d, angular distance), Apple Silicon, single-threaded:

Recall vs QPS on GloVe-25

Summary at best recall per algorithm:

Algorithm      Recall@10   QPS
HNSW (M=16)       100.0%   2,857
Vamana            100.0%   1,177
DiskANN           100.0%   1,029
NSW (M=16)         99.2%   1,288
IVF-PQ             98.7%      69
IVF-AVQ            90.9%     194
RP-Forest          58.5%   4,221

Full numbers and SIFT-128 results in docs/benchmark-results.md.
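
Recall@10 here is the fraction of each query's true 10 nearest neighbors that the index actually returns, averaged over queries. A minimal sketch of the computation (helper names illustrative, not crate API):

use std::collections::HashSet;

// recall@k = |approx ∩ exact| / k, averaged over all queries.
fn recall_at_k(exact: &[Vec<u32>], approx: &[Vec<u32>], k: usize) -> f64 {
    let mut hits = 0usize;
    for (truth, found) in exact.iter().zip(approx) {
        let truth: HashSet<u32> = truth.iter().take(k).copied().collect();
        hits += found.iter().take(k).filter(|id| truth.contains(id)).count();
    }
    hits as f64 / (k * exact.len()) as f64
}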

Supported Algorithms

Each algorithm has a named feature flag:

Algorithm       Feature             Notes
HNSW            hnsw (default)      Best recall/QPS balance for in-memory search
SQ4U            hnsw + sq4          HNSW with 4-bit quantized graph traversal + exact rerank; benefits high-d data
SymphonyQG      hnsw + ivf_rabitq   HNSW with RaBitQ quantized graph traversal; cheap approximate beam search + exact rerank
NSW             nsw                 Flat small-world graph; competitive with HNSW on high-d data
Vamana          vamana              DiskANN-style robust pruning; fast search, higher build time
NSG             nsg                 Monotonic RNG pruning; build slows above ~50K vectors due to O(n) ensure_connectivity
EMG             emg                 Multi-scale graph with alpha-pruning
FINGER          finger              Projection-based distance lower bounds for search pruning
PiPNN           pipnn               Partition-then-refine with HashPrune; reduces I/O during build
FreshGraph      fresh_graph         Streaming insert/delete with tombstones
IVF-PQ          ivf_pq              Compressed index; 32-64x less memory, lower recall
IVF-AVQ         ivf_avq             Anisotropic VQ + reranking; inner product search
IVF-RaBitQ      ivf_rabitq          RaBitQ binary quantization; provable error bounds
RpQuant         rp_quant            Random projection + scalar quantization
BinaryFlat      binary_index        1-bit quantization + full-precision rerank
Curator         curator             K-means tree with per-label Bloom filters; low-selectivity filtered search
FilteredGraph   filtered_graph      Predicate-filtered graph search (AND/OR metadata filters)
ACORN           hnsw                Filtered HNSW search with subgraph sampling (SIGMOD 2024)
RangeFiltered   range_filtered      HNSW + attribute-range post-filter (renamed from esg in 0.8.0)
SparseMIPS      sparse_mips         Graph index for sparse vectors (SPLADE/BM25)
LEMUR           lemur               Late-interaction MIPS; needs externally-provided encoder weights (no in-tree training); mean-pool used in place of OLS
LSH             lsh                 Cross-Polytope LSH (Andoni et al. 2015); single hash table + multiprobe
LsmIndex        hnsw                LSM-tree tiered HNSW for streaming insert/delete/update workloads
DiskANN         diskann             Vamana + SSD I/O layout; experimental
SNG             sng                 OPT-SNG (auto-tuned sparse neighborhood graph); sub-quadratic build per arXiv:2509.15531
DEG             hnsw                Density-adaptive edge budgets (in-house experimental variant; no benchmark)
KD-Tree         kdtree              Exact NN; fast for d <= 20 (experimental)
Ball Tree       balltree            Exact NN; slightly better than KD-Tree for d = 20-50 (experimental)
RP-Forest       rptree              Approximate; fast build, moderate recall (experimental)

Quantization: RaBitQ, SAQ (quantization feature, via qntz crate). PQ is part of ivf_pq.

Experimental status

Algorithms tagged experimental are reachable from the public API but have not yet cleared the bar to be recommended defaults. Each has a specific gap that, once closed, would promote it:

  • DiskANN: file-based save/load works; mmap-backed search is not yet wired (entire index loads into RAM on load_from_file). Promote when mmap I/O lands and recall@10 stays competitive on a 1M-vector dataset.
  • DEG: in-house density-adaptive variant of HNSW. No published benchmark vs plain HNSW. Promote when a head-to-head shows a recall or QPS win on at least two ann-benchmarks datasets.
  • KD-Tree, Ball Tree: exact NN, niche use case (d ≤ 20-50). Stable but not heavily optimized. Promote when a representative workload motivates SIMD'd distance kernels and parallel build.
  • RP-Forest: fast build, moderate recall (~58% on GloVe-25 per the benchmark table above). Promote when a seed-selection or projection improvement closes the recall gap to NSW (~99%) at the same QPS.
  • LEMUR: late-interaction MIPS; ships an inference-only skeleton that requires externally-provided encoder weights and uses mean-pool in place of the paper's OLS fit. Promote when in-tree training lands.

See docs.rs for the full API.

Documentation

  • User guide: quick start, distance metrics, LID, common pitfalls
  • Benchmarks: recall/QPS tables across datasets
  • ANN landscape: algorithmic principles, math foundations, research context
  • References: bibliography for every algorithm in the crate

License

MIT OR Apache-2.0
