Built-in local neural embeddings (all-MiniLM-L6-v2) with auto re-index #3

Closed
KirtiJha wants to merge 2 commits into claude/vsix-packaging-AAlOm from claude/neural-embeddings-AAlOm

KirtiJha (Owner) commented Jun 4, 2026

Summary

Swaps the deterministic hashing embedding behind the zero-config local provider for a real neural sentence-transformer — all-MiniLM-L6-v2 (384-dim) — run locally via Transformers.js. Still zero-config (no API key, no server), but a substantial jump in semantic-search recall. The hashing embedding remains as an automatic fallback.

Stacked on #4. This PR is based on claude/vsix-packaging-AAlOm (the .vsix packaging fix), so the neural runtime deps inherit correct packaging. Review/merge #4 first; GitHub will retarget this to main once #4 lands. The diff below is the neural delta on top of #4.

Verified end-to-end against the real model in this environment: it loads in <1s (once the one-time ~23 MB download is cached), produces 384-dim normalized vectors, and ranks semantically related text above unrelated text.

What changed

Neural backend (src/services/neuralEmbedding.ts) — lazy, cached loader for the Transformers.js pipeline; weights download once to extension storage and are cached offline. Loader is injectable for tests.
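
The lazy, cached, injectable loader described above could be sketched roughly as follows. This is not the contents of neuralEmbedding.ts: `LazyEmbedder`, `EmbedFn`, and `Loader` are hypothetical names, and retrying after a failed load is an assumption of this sketch rather than something the PR states.

```typescript
// Illustrative sketch only; all names here are hypothetical.
type EmbedFn = (text: string) => Promise<number[]>;
type Loader = () => Promise<EmbedFn>;

class LazyEmbedder {
  // The in-flight or completed load, shared by every caller.
  private pending: Promise<EmbedFn> | undefined;

  // The loader is injected so tests can pass a fake instead of the real model.
  constructor(private readonly loader: Loader) {}

  private load(): Promise<EmbedFn> {
    if (!this.pending) {
      this.pending = this.loader().catch((err) => {
        // Assumption: forget a failed load so a later call can try again.
        this.pending = undefined;
        throw err;
      });
    }
    return this.pending;
  }

  // Nothing is loaded until the first embed() call.
  async embed(text: string): Promise<number[]> {
    const fn = await this.load();
    return fn(text);
  }
}
```

In production the injected loader would presumably wrap the Transformers.js feature-extraction pipeline, for example `pipeline('feature-extraction', 'Xenova/all-MiniLM-L6-v2')` called with `{ pooling: 'mean', normalize: true }`; whether the PR uses exactly those options is not shown here. In tests, a loader that returns a fixed vector exercises the caching logic without downloading weights.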

Provider (src/services/embedding.ts): BuiltInEmbeddingProvider prefers neural and falls back to hashing if the ONNX runtime can't load (both 384-dim). Per-call neural failures throw rather than silently mixing two embedding spaces. Adds getBackendSignature() and embedAllPending().
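
A minimal sketch of the fallback policy, assuming the backend is chosen once at init time. `FallbackProvider`, `Embedder`, and the signature strings are hypothetical, `hashingEmbed` is a stand-in and not the PR's hashing embedding, and `getBackendSignature` borrows the PR's method name with an assumed return format.

```typescript
// Illustrative sketch only; not the PR's BuiltInEmbeddingProvider.
type Embedder = { signature: string; embed(text: string): Promise<number[]> };

const DIMS = 384;

// Stand-in hashing embedding: hash each token into a bucket, then L2-normalize.
function hashingEmbed(text: string): number[] {
  const v = new Array<number>(DIMS).fill(0);
  for (const token of text.toLowerCase().split(/\W+/).filter(Boolean)) {
    let h = 2166136261; // FNV-1a offset basis
    for (let i = 0; i < token.length; i++) {
      h ^= token.charCodeAt(i);
      h = Math.imul(h, 16777619); // FNV-1a prime
    }
    v[(h >>> 0) % DIMS] += 1;
  }
  const norm = Math.hypot(...v) || 1;
  return v.map((x) => x / norm);
}

class FallbackProvider {
  private backend: Embedder | undefined;

  constructor(private readonly loadNeural: () => Promise<Embedder>) {}

  // The backend is decided once: neural if it loads, hashing otherwise.
  async init(): Promise<void> {
    try {
      this.backend = await this.loadNeural();
    } catch {
      this.backend = { signature: "hashing-384", embed: async (t) => hashingEmbed(t) };
    }
  }

  getBackendSignature(): string {
    if (!this.backend) throw new Error("provider not initialized");
    return this.backend.signature;
  }

  // Deliberately no per-call fallback: an error here propagates to the caller,
  // so vectors from two different embedding spaces never share one index.
  async embed(text: string): Promise<number[]> {
    if (!this.backend) throw new Error("provider not initialized");
    return this.backend.embed(text);
  }
}
```

The point of the design is that a load failure is recoverable (the whole index ends up in the hashing space), while a per-call failure is not (falling back for one call would leave a vector from the wrong space in a neural index).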

Automatic re-index (correctness) — the embedding backend signature is tracked in the metadata DB; when it changes (hashing → neural, or a provider switch) the vector index is cleared and all embedding_ids reset so the backlog rebuilds cleanly. New helpers: getSetting/setSetting/clearEmbeddingIds.
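
The signature check might look roughly like the sketch below. The helper names getSetting/setSetting/clearEmbeddingIds come from the PR, but their signatures here are assumptions, as are the `MetadataStore` and `VectorIndex` interfaces, the settings key, and the function name `reindexIfBackendChanged`.

```typescript
// Illustrative sketch only; interfaces and the settings key are hypothetical.
interface MetadataStore {
  getSetting(key: string): string | undefined;
  setSetting(key: string, value: string): void;
  clearEmbeddingIds(): void;
}

interface VectorIndex {
  clear(): void;
}

const SIGNATURE_KEY = "embedding.backendSignature"; // hypothetical key name

// Returns true when the stored signature differed and the index was reset.
function reindexIfBackendChanged(
  current: string,
  meta: MetadataStore,
  index: VectorIndex,
): boolean {
  const stored = meta.getSetting(SIGNATURE_KEY);
  if (stored === current) return false;

  index.clear();            // drop vectors from the old embedding space
  meta.clearEmbeddingIds(); // mark every record as needing a fresh embedding
  // Written last, so an interrupted reset is simply repeated on the next run.
  meta.setSetting(SIGNATURE_KEY, current);
  return true;
}
```

In this sketch a first run with no stored signature also counts as a change; that clears an empty index, which is harmless.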

Non-blocking activation — embedding init + re-index + backfill run in the background; changes captured before the model is ready are caught by the backfill pass.
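
One way to picture the non-blocking flow is the sketch below: activation starts the model load without awaiting it, and captures that arrive early wait in a backlog until a backfill pass drains them. `CaptureQueue`, `markReady`, and `activate` are hypothetical; `embedAllPending` borrows the PR's method name, with assumed behavior.

```typescript
// Illustrative sketch only; not the extension's actual activation code.
type Embed = (text: string) => Promise<number[]>;

class CaptureQueue {
  private pending: string[] = [];
  private embed: Embed | undefined;
  readonly vectors = new Map<string, number[]>();

  // Capture never waits on the model; before it is ready, items just queue up.
  capture(text: string): void {
    this.pending.push(text);
    if (this.embed) void this.embedAllPending();
  }

  // Backfill pass: drain the backlog, returning how many items were embedded.
  async embedAllPending(): Promise<number> {
    const embed = this.embed;
    if (!embed) return 0;
    let done = 0;
    let text: string | undefined;
    while ((text = this.pending.shift()) !== undefined) {
      try {
        this.vectors.set(text, await embed(text));
      } catch (err) {
        this.pending.unshift(text); // keep the item for a later pass
        throw err;
      }
      done++;
    }
    return done;
  }

  markReady(embed: Embed): Promise<number> {
    this.embed = embed;
    return this.embedAllPending();
  }
}

// Returns synchronously; `ready` settles once the model loaded and backfill ran.
function activate(
  queue: CaptureQueue,
  loadModel: () => Promise<Embed>,
): { ready: Promise<void> } {
  const ready = loadModel()
    .then((embed) => queue.markReady(embed))
    .then(() => undefined);
  return { ready };
}
```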

Packaging (stacked on #4, now completed for the neural runtime):

  • @xenova/transformers is an optional dependency, external in esbuild (dynamic import), so the extension degrades to hashing if it's absent.
  • onnxruntime-node ships all platforms' binaries in one ~93 MB package; scripts/package-target.mjs now trims it to the target platform per .vsix.
  • The unused onnxruntime-web browser/WASM backend (~74 MB of .wasm) is excluded statically. Net: linux-x64 .vsix ≈ 49 MB, verified by inspecting the archive.
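
The per-target trimming can be thought of as generating extra ignore patterns for every platform except the one being packaged. The sketch below is not scripts/package-target.mjs: `ignoreLinesFor` is a hypothetical helper, and the default `binRoot` layout (`bin/napi-v3/<platform>/<arch>`) is an assumption about how onnxruntime-node arranges its binaries that should be checked against the installed version.

```typescript
// Illustrative sketch only; function name and directory layout are assumptions.
function ignoreLinesFor(
  target: string,
  allTargets: readonly string[],
  binRoot = "node_modules/onnxruntime-node/bin/napi-v3", // assumed layout
): string[] {
  if (!allTargets.includes(target)) {
    throw new Error(`unknown target: ${target}`);
  }
  return allTargets
    .filter((t) => t !== target)
    .map((t) => {
      // VS Code targets look like "<platform>-<arch>", e.g. "linux-x64".
      const [platform, arch] = t.split("-");
      return `${binRoot}/${platform}/${arch}/**`;
    });
}

// Patterns a packaging script might append to .vscodeignore for a linux-x64 build.
const extraIgnores = ignoreLinesFor("linux-x64", ["linux-x64", "darwin-arm64", "win32-x64"]);
```

A packaging script would append these lines before invoking the packager and restore the original .vscodeignore afterward, as the PR describes.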

Tests

  • Injected-loader unit tests for the neural path and hashing fallback; metadata getSetting/setSetting/clearEmbeddingIds tests; opt-in real-model test (RUN_NEURAL=1).
  • ✅ Typecheck, lint (0 errors), 53 tests pass (1 opt-in skipped), build.

https://claude.ai/code/session_014bNJaULcYHnDkqUP6HcemQ

claude added 2 commits June 4, 2026 12:24
…index

Replaces the deterministic hashing embedding behind the zero-config `local`
provider with a real neural sentence-transformer (all-MiniLM-L6-v2, 384-dim)
run locally via Transformers.js, substantially improving semantic-search recall.

- neuralEmbedding.ts: lazy, cached loader for the Transformers.js feature-
  extraction pipeline; weights (~23 MB) download once to the extension storage
  dir and are cached for offline use. Loader is injectable for tests.
- embedding.ts: new BuiltInEmbeddingProvider prefers the neural backend and
  transparently falls back to the hashing embedding if the ONNX runtime can't
  load (same 384 dims). Per-call neural errors throw rather than silently
  mixing embedding spaces. Adds getBackendSignature() and embedAllPending().
- Auto re-index: on activation (and on settings change) the embedding backend
  signature is compared to the value stored in the metadata DB; if it changed
  (hashing -> neural, or a provider switch) the vector index is cleared and all
  embedding ids reset so the backlog is rebuilt cleanly. New metadata helpers:
  getSetting/setSetting/clearEmbeddingIds.
- Activation is never blocked on the model download: embedding init + re-index +
  backfill all run in the background; capture/embeddings that arrive before the
  model is ready are simply picked up by the backfill pass.
- @xenova/transformers added as an OPTIONAL dependency and marked external in
  esbuild (loaded at runtime like @lancedb/lancedb), so the extension still
  works on the hashing fallback if it isn't installed.

Tests: injected-loader unit tests for the neural path and hashing fallback, an
opt-in real-model test (RUN_NEURAL=1), and metadata settings/clear tests. 53
tests pass. Docs updated.

Now that the neural embedding deps are stacked on the packaging fix, complete
the size story for the heavier runtime:

- onnxruntime-node ships every platform's native binary in one ~93 MB package
  (unlike @lancedb/lancedb's per-platform npm packages). scripts/package-target.mjs
  now temporarily excludes the non-target platforms' binaries from .vscodeignore
  while packaging (restoring the file afterward), so each .vsix carries only its
  own platform's binary. The Release workflow uses the script for every target.
- onnxruntime-web (the browser/WASM backend) and the duplicate .wasm binaries in
  @xenova/transformers/dist (~74 MB total) are never loaded in the Node extension
  host, so they're excluded statically in .vscodeignore (keeping the JS so any
  require still resolves).

Net: a linux-x64 .vsix drops from ~70 MB to ~49 MB, still containing the
lancedb, onnxruntime-node and sharp native runtimes for that platform (the model
weights are downloaded at runtime, not bundled). Verified by packaging and
inspecting the archive.

KirtiJha force-pushed the claude/neural-embeddings-AAlOm branch from 803270b to 3d731f4 June 4, 2026 12:31
KirtiJha changed the base branch from main to claude/vsix-packaging-AAlOm June 4, 2026 12:31
KirtiJha deleted the branch claude/vsix-packaging-AAlOm June 4, 2026 12:33
KirtiJha closed this Jun 4, 2026