Skip to content

rlogger/bayes-hdc

bayes-hdc — probabilistic hyperdimensional computing in JAX

Tests Docs CodeQL Coverage PyPI DOI Python License

Documentation · Quickstart in Colab · Examples · Benchmarks · Discussions

Hyperdimensional computing (HDC, also known as vector symbolic architectures) represents data as ~10,000-dimensional vectors combined with cheap elementwise algebra: fast, noise-robust, trivially parallel, and a natural fit for edge hardware. Its weak spot is that predictions come out as raw similarity scores with no notion of confidence. bayes-hdc is the first general-purpose library to fix that: hypervectors that carry distributions, calibrated probabilities, and conformal prediction with finite-sample coverage guarantees. It is JAX end to end — every type is a pytree, so jit, vmap, grad, and pmap compose with everything.

pip install bayes-hdc

Anomaly detection with a guaranteed false-positive rate

The headline use case: one-class anomaly detection where the false-positive rate is guaranteed at your target alpha — finite-sample, distribution-free, not tuned by hand. No other HDC library ships this.

Conformal anomaly detection: empirical false-positive rate tracks the target alpha

Copy-paste runnable:

import numpy as np
from bayes_hdc.sklearn import HDAnomalyDetector

rng = np.random.default_rng(0)
X_normal = rng.normal(size=(500, 16)).astype("float32")        # fit on normal data only
X_test   = np.vstack([rng.normal(size=(50, 16)),
                      rng.normal(loc=6.0, size=(50, 16))]).astype("float32")

det = HDAnomalyDetector(alpha=0.05).fit(X_normal)
labels = det.predict(X_test)        # +1 inlier / -1 outlier; marginal FP rate <= alpha
pvals  = det.score_samples(X_test)  # split-conformal p-values

The JAX-native pipeline underneath (custom encoders, fit_anomaly_pipeline, Benjamini-Hochberg FDR control across a batch of queries) is walked through in tutorials/02_anomaly_detection.py. On one-class versions of three small standard datasets it has the best AUROC on two of three against IsolationForest, LOF, and OneClassSVM, while holding the false-positive rate at the target — a knob none of those baselines have. Numbers and harness: BENCHMARKS.md.

Calibrated probabilities and prediction sets

Hypervectors can carry distributions (GaussianHV, DirichletHV) with closed-form moment propagation through bind and bundle, and any classifier's outputs can be wrapped with temperature scaling and split-conformal sets:

from bayes_hdc import TemperatureCalibrator, ConformalClassifier

probs = TemperatureCalibrator.create().fit(logits_cal, y_cal).calibrate(logits_test)

conformal = ConformalClassifier.create(alpha=0.1).fit(probs_cal, y_cal)
sets      = conformal.predict_set(probs)        # (n, k) bool mask
coverage  = conformal.coverage(probs, y_test)   # >= 1-alpha in expectation (marginal)

The scikit-learn wrapper covers classification too — it encodes internally and slots into pipelines, cross_val_score, and GridSearchCV unchanged:

from bayes_hdc.sklearn import HDClassifier

HDClassifier(encoder="kernel").fit(X_train, y_train).predict_proba(X_test)

Benchmarks

Standard HDC datasets, 5 seeds, both encoders tuned with the same bandwidth search on identical splits (UCI-HAR uses the official subject-disjoint split). Full protocol and the anomaly table: BENCHMARKS.md.

Dataset bayes-hdc accuracy TorchHD accuracy (tuned) bayes-hdc ECE, raw → calibrated Coverage @ α=0.1
ISOLET 0.895 ± 0.004 0.882 ± 0.006 0.845 → 0.022 0.901
UCI-HAR 0.849 ± 0.006 0.871 ± 0.005 0.633 → 0.031 0.904
EMG gestures 0.944 ± 0.014 0.892 ± 0.005 0.618 → 0.045 0.947

Accuracy is competitive — ahead on two, behind on one — and the right columns are the point: calibrated probabilities and coverage at the target, which the deterministic libraries don't provide. Every number reproduces from a committed script with embedded provenance (make bench-canonical).

In the HDC library landscape

The deterministic substrate (eight VSA models: BSC, MAP, HRR, FHRR, BSBC, CGR, MCR, VTB) is comparable to TorchHD and HoloVec; the differentiation is the probabilistic and uncertainty-quantification layer.

Library Backend VSA models Probabilistic / UQ Differentiable
TorchHD PyTorch 8 partial
HoloVec NumPy / PyTorch / JAX 8 partial
hdlib NumPy generic
vsapy NumPy 6
NengoSPA Nengo (spiking) 3
bayes-hdc JAX 8 Gaussian/Dirichlet HVs, conformal classifier + regressor + anomaly detector end-to-end

Design rationale and per-primitive paper attributions: DESIGN.md · docs/LITERATURE_AUDIT.md.

Examples

emg_gesture_recognition.py sEMG gestures with calibrated per-gesture probabilities
anomaly_detection_intrusion.py network intrusion flags at a guaranteed FP rate
vision_action_policy.py vision-action policy with per-DOF conformal intervals and abstention
kanerva_example.py "What's the Dollar of Mexico?" role-filler analogy

Sixteen more in examples/, and two worked tutorials in tutorials/.

Status

Alpha (0.5.0a1): the API may shift before 1.0. 666 tests at 93% line coverage run on Ubuntu and macOS across Python 3.9–3.13 on every push; tests verify the VSA algebraic laws on randomized inputs, gradient correctness against finite differences, and the conformal coverage and FDR guarantees directly. Sharp edges: GPU/TPU paths are tested in CI on CPU only, the variational-training API is the most likely to change, and bayes_hdc.sklearn needs scikit-learn installed separately.

Pure Python on top of jax + numpy; no compiled extensions.

Contributing

Good first issues are scoped and mentored; setup and style live in CONTRIBUTING.md. Questions and show-and-tell go in Discussions. If the library is useful to you, consider starring the repo — it genuinely helps others find it.

Citing

@software{bayeshdc2026,
  author  = {Singh, Rajdeep},
  title   = {bayes-hdc: Calibrated, Differentiable Hyperdimensional Computing in JAX},
  url     = {https://github.com/rlogger/bayes-hdc},
  version = {0.5.0a1},
  year    = {2026}
}

Or use the "Cite this repository" button (backed by CITATION.cff).

License

MIT. See also: JAX · TorchHD · awesome-jax · Kleyko et al.'s HDC/VSA surveys.

Packages

 
 
 

Contributors

Languages