Built-in local neural embeddings (all-MiniLM-L6-v2) with auto re-index by KirtiJha · Pull Request #3 · KirtiJha/code-historian

KirtiJha · 2026-06-04T12:02:18Z

Summary

Swaps the deterministic hashing embedding behind the zero-config local provider for a real neural sentence-transformer — all-MiniLM-L6-v2 (384-dim) — run locally via Transformers.js. Still zero-config (no API key, no server), but a substantial jump in semantic-search recall. The hashing embedding remains as an automatic fallback.

Stacked on #4. This PR is based on claude/vsix-packaging-AAlOm (the .vsix packaging fix), so the neural runtime deps inherit correct packaging. Review/merge #4 first; GitHub will retarget this to main once #4 lands. The diff below is the neural delta on top of #4.

Verified end-to-end against the real model in this environment: it loads in <1s (after a one-time ~23 MB download that is then cached), produces 384-dim normalized vectors, and ranks semantically related text above unrelated text.
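The normalization and ranking checks can be illustrated with a small sketch. This is not the PR's test code: the helper names (`l2Normalize`, `dot`, `rankBySimilarity`) and the toy 2-dim vectors are assumptions for illustration. The point is that for unit-length vectors cosine similarity reduces to a dot product, which is why normalized output matters for ranking.

```typescript
// L2-normalize a vector; a zero vector is returned unchanged (as a copy).
function l2Normalize(v: number[]): number[] {
  const norm = Math.sqrt(v.reduce((sum, x) => sum + x * x, 0));
  return norm === 0 ? v.slice() : v.map((x) => x / norm);
}

// Dot product; equals cosine similarity when both inputs are unit length.
function dot(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("dimension mismatch");
  return a.reduce((sum, x, i) => sum + x * (b[i] ?? 0), 0);
}

// Rank candidates by similarity to the query, most similar first.
function rankBySimilarity(
  query: number[],
  candidates: { id: string; vector: number[] }[],
): { id: string; score: number }[] {
  const q = l2Normalize(query);
  return candidates
    .map((c) => ({ id: c.id, score: dot(q, l2Normalize(c.vector)) }))
    .sort((x, y) => y.score - x.score);
}
```

A real check would feed 384-dim model output into the same ranking step; the toy vectors only keep the arithmetic easy to follow.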

What changed

Neural backend (src/services/neuralEmbedding.ts) — lazy, cached loader for the Transformers.js pipeline; weights download once to extension storage and are cached offline. Loader is injectable for tests.
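A minimal sketch of a lazy, cached, injectable loader, assuming nothing about the actual contents of neuralEmbedding.ts. The names `Embedder`, `EmbedderLoader`, and `createLazyEmbedder` are hypothetical. In the extension the injected loader would presumably wrap the Transformers.js feature-extraction pipeline; here a fake loader stands in so the example runs without downloading weights.

```typescript
// Hypothetical types: an embedder maps texts to vectors; a loader builds one.
type Embedder = (texts: string[]) => Promise<number[][]>;
type EmbedderLoader = () => Promise<Embedder>;

// Returns a getter that invokes `loader` at most once per successful load.
// Concurrent callers share the same in-flight promise.
function createLazyEmbedder(loader: EmbedderLoader): () => Promise<Embedder> {
  let cached: Promise<Embedder> | undefined;
  return () => {
    if (cached === undefined) {
      cached = loader().catch((err: unknown) => {
        cached = undefined; // drop the failed attempt so a later call can retry
        throw err;
      });
    }
    return cached;
  };
}

// A test can inject a fake loader instead of loading real weights.
let loadCount = 0;
const getEmbedder = createLazyEmbedder(async () => {
  loadCount += 1;
  return async (texts: string[]) => texts.map((t) => [t.length, 0, 0]);
});
```

Caching the promise rather than the resolved embedder means two callers that arrive during the first load do not trigger a second load.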

Provider (src/services/embedding.ts) — BuiltInEmbeddingProvider prefers neural, falls back to hashing if the ONNX runtime can't load (both 384-dim). Per-call neural failures throw rather than silently mixing two embedding spaces. Adds getBackendSignature() and embedAllPending().
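The fallback rule can be sketched as follows. This is an illustration under stated assumptions, not the PR's BuiltInEmbeddingProvider: the class name `FallbackEmbeddingSketch`, the `EmbedFn` type, and the `backend:dims` signature format are all hypothetical. The backend is chosen once at init time; after that, errors from individual calls propagate.

```typescript
type EmbedFn = (text: string) => Promise<number[]>;
type Backend = "neural" | "hashing";

class FallbackEmbeddingSketch {
  private readonly loadNeural: () => Promise<EmbedFn>;
  private readonly hashing: EmbedFn;
  private active: { backend: Backend; embed: EmbedFn } | undefined;

  constructor(loadNeural: () => Promise<EmbedFn>, hashing: EmbedFn) {
    this.loadNeural = loadNeural;
    this.hashing = hashing;
  }

  // The backend is chosen once, here. A load failure selects hashing.
  async init(): Promise<Backend> {
    let chosen: { backend: Backend; embed: EmbedFn };
    try {
      chosen = { backend: "neural", embed: await this.loadNeural() };
    } catch {
      chosen = { backend: "hashing", embed: this.hashing };
    }
    this.active = chosen;
    return chosen.backend;
  }

  // Assumed format; it only needs to change whenever the backend changes.
  getBackendSignature(): string {
    if (!this.active) throw new Error("init() has not completed");
    return `${this.active.backend}:384`;
  }

  // No try/catch on purpose: a per-call neural error propagates instead of
  // quietly returning a hashing vector from a different embedding space.
  async embed(text: string): Promise<number[]> {
    if (!this.active) throw new Error("init() has not completed");
    return this.active.embed(text);
  }
}
```

Both backends produce 384-dim vectors, so a silent per-call fallback would not fail on shape; it would just make similarity scores meaningless, which is the reason to throw.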

Automatic re-index (correctness) — the embedding backend signature is tracked in the metadata DB; when it changes (hashing → neural, or a provider switch) the vector index is cleared and all embedding_ids reset so the backlog rebuilds cleanly. New helpers: getSetting/setSetting/clearEmbeddingIds.
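The signature check can be sketched like this. The PR names getSetting/setSetting/clearEmbeddingIds but does not show their signatures, so the `MetadataStore` and `VectorIndex` interfaces, the `SIGNATURE_KEY` value, and the function name `reindexIfBackendChanged` are assumptions for illustration.

```typescript
// Assumed shapes; only the three method names come from the PR description.
interface MetadataStore {
  getSetting(key: string): string | undefined;
  setSetting(key: string, value: string): void;
  clearEmbeddingIds(): void;
}

interface VectorIndex {
  clear(): void;
}

const SIGNATURE_KEY = "embedding.backendSignature"; // hypothetical key name

// Returns true when a reset happened and the backlog needs re-embedding.
function reindexIfBackendChanged(
  meta: MetadataStore,
  index: VectorIndex,
  currentSignature: string,
): boolean {
  if (meta.getSetting(SIGNATURE_KEY) === currentSignature) return false;
  index.clear(); // old vectors belong to a different embedding space
  meta.clearEmbeddingIds(); // every record becomes pending again
  meta.setSetting(SIGNATURE_KEY, currentSignature); // recorded last
  return true;
}
```

In this sketch the new signature is written last, so an interruption partway through leaves the old value in place and the reset is retried on the next run. A missing stored signature is treated as a change, which also covers an index built before signatures were tracked.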

Non-blocking activation — embedding init + re-index + backfill run in the background; changes captured before the model is ready are caught by the backfill pass.
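One way to picture the non-blocking flow, as a sketch rather than the extension's actual activation code: captures are recorded immediately with no embedding id, and a backfill pass embeds whatever is pending once the model is ready. The class `BackgroundEmbeddingSketch` and its in-memory storage are hypothetical; only the name embedAllPending comes from the PR.

```typescript
type EmbedText = (text: string) => Promise<number[]>;

interface CapturedChange {
  id: string;
  text: string;
  embeddingId: string | null; // null means "not embedded yet"
}

class BackgroundEmbeddingSketch {
  private readonly changes: CapturedChange[] = [];
  private readonly vectors = new Map<string, number[]>();
  private embed: EmbedText | undefined;

  // Capture is synchronous and never waits for the model.
  capture(id: string, text: string): void {
    this.changes.push({ id, text, embeddingId: null });
  }

  // Starts model loading; the caller does not need to await the result.
  startInBackground(load: () => Promise<EmbedText>): Promise<number> {
    return load().then((embed) => {
      this.embed = embed;
      return this.embedAllPending();
    });
  }

  // Backfill: embed every change that still has no embedding id.
  async embedAllPending(): Promise<number> {
    const embed = this.embed;
    if (embed === undefined) return 0; // model not ready; leave the backlog
    let done = 0;
    for (const change of this.changes) {
      if (change.embeddingId !== null) continue;
      this.vectors.set(change.id, await embed(change.text));
      change.embeddingId = change.id;
      done += 1;
    }
    return done;
  }

  pendingCount(): number {
    return this.changes.filter((c) => c.embeddingId === null).length;
  }
}
```

An activation function built this way would call `startInBackground` without awaiting it and attach a `.catch` for logging, so a slow first-time download delays only search quality, not startup.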

Packaging (stacked on #4, now completed for the neural runtime):

@xenova/transformers is an optional dependency, external in esbuild (dynamic import), so the extension degrades to hashing if it's absent.
onnxruntime-node ships all platforms' binaries in one ~93 MB package; scripts/package-target.mjs now trims it to the target platform per .vsix.
The unused onnxruntime-web browser/WASM backend (~74 MB of .wasm) is excluded statically. Net: linux-x64 .vsix ≈ 49 MB, verified by inspecting the archive.
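The per-target trimming step can be sketched as a pure function that computes which platform directories to exclude. This is not the contents of scripts/package-target.mjs. The function name `nonTargetIgnorePatterns`, the `binRoot` parameter, and the `<platform>/<arch>` directory layout are assumptions for illustration and should be checked against the installed package.

```typescript
// Build ignore patterns that drop every platform directory except the
// target's. Targets are assumed to look like "linux-x64" or "darwin-arm64".
function nonTargetIgnorePatterns(
  target: string,
  allTargets: string[],
  binRoot: string, // assumed root directory holding <platform>/<arch> folders
): string[] {
  return allTargets
    .filter((t) => t !== target)
    .map((t) => {
      const [platform, arch] = t.split("-");
      return `${binRoot}/${platform}/${arch}/**`;
    });
}
```

A packaging script could append these patterns to the ignore file before packaging and restore the original file afterward, ideally in a `finally` block so a failed build does not leave the file modified.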

Tests

Injected-loader unit tests for the neural path and hashing fallback; metadata getSetting/setSetting/clearEmbeddingIds tests; opt-in real-model test (RUN_NEURAL=1).
✅ Typecheck, lint (0 errors), 53 tests pass (1 opt-in skipped), build.

https://claude.ai/code/session_014bNJaULcYHnDkqUP6HcemQ

…index

Replaces the deterministic hashing embedding behind the zero-config `local` provider with a real neural sentence-transformer (all-MiniLM-L6-v2, 384-dim) run locally via Transformers.js, substantially improving semantic-search recall.

- neuralEmbedding.ts: lazy, cached loader for the Transformers.js feature-extraction pipeline; weights (~23 MB) download once to the extension storage dir and are cached for offline use. Loader is injectable for tests.
- embedding.ts: new BuiltInEmbeddingProvider prefers the neural backend and transparently falls back to the hashing embedding if the ONNX runtime can't load (same 384 dims). Per-call neural errors throw rather than silently mixing embedding spaces. Adds getBackendSignature() and embedAllPending().
- Auto re-index: on activation (and on settings change) the embedding backend signature is compared to the value stored in the metadata DB; if it changed (hashing -> neural, or a provider switch) the vector index is cleared and all embedding ids reset so the backlog is rebuilt cleanly. New metadata helpers: getSetting/setSetting/clearEmbeddingIds.
- Activation is never blocked on the model download: embedding init + re-index + backfill all run in the background; capture/embeddings that arrive before the model is ready are simply picked up by the backfill pass.
- @xenova/transformers added as an OPTIONAL dependency and marked external in esbuild (loaded at runtime like @lancedb/lancedb), so the extension still works on the hashing fallback if it isn't installed.

Tests: injected-loader unit tests for the neural path and hashing fallback, an opt-in real-model test (RUN_NEURAL=1), and metadata settings/clear tests. 53 tests pass. Docs updated.

Now that the neural embedding deps are stacked on the packaging fix, complete the size story for the heavier runtime:

- onnxruntime-node ships every platform's native binary in one ~93 MB package (unlike @lancedb/lancedb's per-platform npm packages). scripts/package-target.mjs now temporarily excludes the non-target platforms' binaries from .vscodeignore while packaging (restoring the file afterward), so each .vsix carries only its own platform's binary. The Release workflow uses the script for every target.
- onnxruntime-web (the browser/WASM backend) and the duplicate .wasm binaries in @xenova/transformers/dist (~74 MB total) are never loaded in the Node extension host, so they're excluded statically in .vscodeignore (keeping the JS so any require still resolves).

Net: a linux-x64 .vsix drops from ~70 MB to ~49 MB, still containing the lancedb, onnxruntime-node and sharp native runtimes for that platform (the model weights are downloaded at runtime, not bundled). Verified by packaging and inspecting the archive.

KirtiJha mentioned this pull request Jun 4, 2026

Fix broken .vsix packaging: ship native deps + platform-specific builds #4

Merged

claude added 2 commits June 4, 2026 12:24

KirtiJha force-pushed the claude/neural-embeddings-AAlOm branch from 803270b to 3d731f4 Compare June 4, 2026 12:31

KirtiJha changed the base branch from main to claude/vsix-packaging-AAlOm June 4, 2026 12:31

KirtiJha deleted the branch claude/vsix-packaging-AAlOm June 4, 2026 12:33

KirtiJha closed this Jun 4, 2026

KirtiJha mentioned this pull request Jun 4, 2026

Built-in local neural embeddings (all-MiniLM-L6-v2) with auto re-index #5

Merged

KirtiJha wants to merge 2 commits into claude/vsix-packaging-AAlOm from claude/neural-embeddings-AAlOm
