
Add Jina v5 embeddings provider #1257

Open
hanxiao wants to merge 1 commit into getzep:main from hanxiao:feat/jina-embeddings

Conversation

@hanxiao

@hanxiao hanxiao commented Feb 22, 2026

Summary

Add Jina AI as a supported embedding provider for Graphiti knowledge graphs.

jina-embeddings-v5-text-nano (239M params) is the top-ranked embedding model under 500M parameters on MTEB:

  • MTEB English v2: 71.0 avg
  • MMTEB multilingual: 65.5 avg
  • Half the size of comparable models with superior performance across retrieval, STS, and reranking tasks

MMTEB Multilingual Benchmark

MMTEB scores vs model size. jina-v5-text models (red) outperform models 2-16x their size. (source)

MTEB English Benchmark

MTEB English v2 scores. v5-text-nano (239M) achieves 71.0, matching models with 2x+ parameters. (source)

| Model | Params | Dim | Max Tokens | MTEB-EN | MMTEB |
| --- | --- | --- | --- | --- | --- |
| jina-embeddings-v5-text-nano (default) | 239M | 768 | 8192 | 71.0 | 65.5 |
| jina-embeddings-v5-text-small | 677M | 1024 | 32768 | 71.7 | 67.0 |

Paper: arXiv:2602.15547 | Blog | HuggingFace

Changes

  • New JinaEmbedder and JinaEmbedderConfig in graphiti_core/embedder/jina.py
  • Uses Jina's OpenAI-compatible API (https://api.jina.ai/v1/embeddings)
  • Supports task-specific embeddings via task parameter (retrieval.passage, retrieval.query, etc.)
  • API key via api_key config or JINA_API_KEY environment variable
  • Default model: jina-embeddings-v5-text-nano (768d)
  • Also supports: jina-embeddings-v5-text-small (1024d)
  • Comprehensive unit tests with mocked API calls

Usage

```python
from graphiti_core.embedder import JinaEmbedder, JinaEmbedderConfig

# Default configuration (passage embeddings for indexing)
config = JinaEmbedderConfig(
    api_key="your-jina-api-key",
    embedding_model="jina-embeddings-v5-text-nano",
    embedding_dim=768,
    task="retrieval.passage",
)
embedder = JinaEmbedder(config)

# For queries, use the retrieval.query task
query_config = JinaEmbedderConfig(
    api_key="your-jina-api-key",
    task="retrieval.query",
)
query_embedder = JinaEmbedder(query_config)
```
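Since the provider uses Jina's OpenAI-compatible endpoint, the request body the embedder sends is easy to picture. The sketch below builds that JSON payload as a plain function; the `task` field is Jina's extension to the OpenAI schema, and this helper (name and defaults included) is illustrative, not part of the PR:

```python
def build_embeddings_request(
    texts: list[str],
    model: str = "jina-embeddings-v5-text-nano",
    task: str = "retrieval.passage",
) -> dict:
    """Build the JSON body for POST https://api.jina.ai/v1/embeddings.

    `model` and `input` follow the OpenAI embeddings schema; `task` is
    a Jina-specific extension selecting the task-adapted embedding
    (e.g. retrieval.passage vs retrieval.query).
    """
    return {"model": model, "input": texts, "task": task}
```

Sending this body with an `Authorization: Bearer <JINA_API_KEY>` header is all the OpenAI-compatible API needs.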

@danielchalef
Member

danielchalef commented Feb 22, 2026

All contributors have signed the CLA ✍️ ✅
Posted by the CLA Assistant Lite bot.

@hanxiao
Author

hanxiao commented Feb 22, 2026

I have read the CLA Document and I hereby sign the CLA

danielchalef added a commit that referenced this pull request Feb 22, 2026
yhl999 added a commit to yhl999/bicameral that referenced this pull request Feb 22, 2026
…#70)

* fix(summary): exclude duplicate edges from node summary generation (getzep#1223)

* fix(summary): exclude duplicate edges from node summary generation

When resolving extracted edges, edges that match existing edges in the
graph were still being passed to node summary generation, causing facts
to be duplicated in summaries.

Changes:
- Update resolve_extracted_edges to return new_edges (non-duplicates)
- Update _extract_and_resolve_edges to pass through new_edges
- Pass only new_edges to extract_attributes_from_nodes in add_episode
- An edge is considered "new" if its resolved UUID matches extracted UUID

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* chore: bump version to 0.27.1

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>

* feat: simplify extraction pipeline and add batch entity summarization (getzep#1224)

* feat(llm): add token usage tracking for LLM calls

Add TokenUsageTracker class to track input/output tokens by prompt type
during LLM calls. This helps analyze token costs across different
operations like extract_nodes, extract_edges, resolve_nodes, etc.

Changes:
- Add graphiti_core/llm_client/token_tracker.py with TokenUsageTracker
- Update LLMClient base class to include token_tracker instance
- Update OpenAI base client to capture and record token usage
- Add token_tracker property on Graphiti class for easy access
- Update podcast_runner.py to print token usage summary after ingestion

Usage:
  client = Graphiti(...)
  # ... run ingestion ...
  client.token_tracker.print_summary(sort_by='prompt_name')

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
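The tracker described above accumulates input/output tokens per prompt type. A minimal sketch of that accounting, assuming a `record`/`summary` API (the real class in graphiti_core/llm_client/token_tracker.py may differ):

```python
from collections import defaultdict


class TokenUsageTracker:
    """Accumulate input/output token counts keyed by prompt name.

    Sketch only: method names and the summary shape are assumptions,
    not graphiti's actual interface.
    """

    def __init__(self) -> None:
        self._usage: dict[str, dict[str, int]] = defaultdict(
            lambda: {"input": 0, "output": 0}
        )

    def record(self, prompt_name: str, input_tokens: int, output_tokens: int) -> None:
        # Called once per LLM response, e.g. with usage numbers from the API.
        entry = self._usage[prompt_name]
        entry["input"] += input_tokens
        entry["output"] += output_tokens

    def summary(self) -> dict[str, dict[str, int]]:
        # Totals per prompt type (extract_nodes, extract_edges, ...).
        return dict(self._usage)
```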

* chore: temporarily disable summary early return optimization

Disable the optimization that skips LLM calls when node summary + edge
facts is under 2000 characters. This forces all summaries to be
generated via LLM for token usage analysis.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* Revert "chore: temporarily disable summary early return optimization"

This reverts the summary optimization changes.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* feat: simplify extraction pipeline and add batch entity summarization

- Remove chunking code for entity-dense episodes (node_operations.py)
  - Delete _extract_nodes_chunked, _extract_from_chunk, _merge_extracted_entities
  - Always use single LLM call for entity extraction

- Remove chunking code for edge extraction (edge_operations.py)
  - Remove MAX_NODES constant and generate_covering_chunks usage
  - Process all nodes in single LLM call instead of covering subsets

- Add batch entity summarization (node_operations.py, extract_nodes.py)
  - New SummarizedEntity and SummarizedEntities Pydantic models
  - New extract_summaries_batch prompt for batch processing
  - New _extract_entity_summaries_batch function
  - Nodes with short summaries get edge facts appended directly (no LLM)
  - Only nodes needing LLM summarization are batched together

- Simplify edge attribute extraction (extract_edges.py, edge_operations.py)
  - Remove episode_content from context (attributes from fact only)
  - Keep reference_time for temporal resolution
  - Add existing_attributes to preserve/update existing values

- Improve edge deduplication prompt (dedupe_edges.py, edge_operations.py)
  - Use continuous indexing across duplicate and invalidation candidates
  - Deduplicate invalidation candidates against duplicate candidates
  - Allow EXISTING FACTS to be both duplicates AND contradicted
  - Consolidate to single contradicted_facts field

- Remove obsolete chunking tests (test_entity_extraction.py)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* chore: bump version to 0.27.2pre1

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* Add token tracking for Anthropic/Gemini clients and missing tests

- Implement token tracking in AnthropicClient._generate_response()
  and generate_response() using result.usage.input_tokens/output_tokens
- Implement token tracking in GeminiClient._generate_response()
  and generate_response() using response.usage_metadata
- Add comprehensive unit tests for TokenUsageTracker class
- Add tests for _extract_entity_summaries_batch function covering:
  - No nodes needing summarization
  - Short summaries with edge facts
  - Long summaries requiring LLM
  - Node filter (should_summarize_node)
  - Batch multiple nodes
  - Unknown entity handling
  - Missing episode and summary

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* Update test_node_operations.py for batch summarization API

- Remove import of extract_attributes_from_node (function was removed)
- Add import of _extract_entity_summaries_batch
- Update tests to use new batch summarization API

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* Add MAX_NODES limit for batch entity summarization

- Add MAX_NODES = 30 constant
- Partition nodes needing summarization into flights of MAX_NODES
- Extract _process_summary_flight helper for processing each flight
- Each flight makes a separate LLM call to avoid context overflow

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
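The flight partitioning above is a simple chunking step. A sketch, assuming a standalone helper (the function name is illustrative; the PR folds this into `_process_summary_flight` handling):

```python
def partition_into_flights(nodes: list, max_nodes: int = 30) -> list[list]:
    """Split nodes needing summarization into flights of at most
    max_nodes, so each flight fits in a single LLM call without
    overflowing the context window."""
    return [nodes[i : i + max_nodes] for i in range(0, len(nodes), max_nodes)]
```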

* Change default OpenAI models to gpt-5-mini

Update both DEFAULT_MODEL and DEFAULT_SMALL_MODEL to use gpt-5-mini.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* Update podcast_runner.py to use default OpenAI models

Remove explicit model configuration to use the default gpt-5-mini models
from OpenAIClient.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* Revert default model changes to gpt-4.1-mini/nano

Restore the original default models instead of gpt-5-mini.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* Address PR review comments

- Fix unreachable code in _handle_structured_response (check response.refusal)
- Process node summary flights in parallel using semaphore_gather
- Use case-insensitive name matching for LLM summary responses
- Handle duplicate node names by applying summary to all matching nodes
- Fix edge case when both edge lists are empty in contradiction processing
- Fix potential AttributeError when episode is None in edge attributes
- Add tests for flight partitioning and case-insensitive name matching

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>

* chore(deps): update dependencies to fix dependabot alerts (getzep#1225)

Update lock files to address security alerts:
- cryptography, cffi, and other security-related packages
- Major version bumps for langchain-core and related packages
- Minor updates to other dependencies

Co-authored-by: Claude Haiku 4.5 <noreply@anthropic.com>

* @contextablemark has signed the CLA in getzep#1227

* @avonian has signed the CLA in getzep#1230

* feat: driver operations architecture redesign (getzep#1232)

* feat: add driver operations architecture with abstract interfaces and concrete implementations

Introduces a clean operations-based architecture for graph driver operations,
replacing inline query logic with abstract interfaces (ABCs) and concrete
implementations for both Neo4j and FalkorDB backends.

Key changes:
- Add QueryExecutor and Transaction ABCs for database-agnostic query execution
- Add 11 operations ABCs covering all node, edge, search, and graph maintenance operations
- Implement all 11 operations for Neo4j with real transaction commit/rollback
- Implement all 11 operations for FalkorDB with RedisSearch fulltext and vecf32 embeddings
- Add NodeNamespace and EdgeNamespace convenience wrappers on Graphiti class
- Wire operations into Neo4jDriver and FalkorDriver with property accessors
- Fix circular import by moving STOPWORDS to graphiti_core.driver.falkordb package
- Include design spec documenting architecture decisions and migration plan

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: address PR review feedback

- Fix ruff UP037: remove quoted type annotations in driver.py
  (redundant with `from __future__ import annotations`)
- Extract duplicate record parsers into shared record_parsers.py module,
  eliminating identical _entity_node_from_record, _entity_edge_from_record,
  _episodic_node_from_record, and _community_node_from_record across
  10 files in both Neo4j and FalkorDB operations
- Fix MAX_QUERY_LENGTH inconsistency in FalkorDB search_ops
  build_fulltext_query (was 8000, now uses module constant 128)
- Make namespace attributes unconditional with NotImplementedError
  for drivers that don't implement required operations

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: make namespace init graceful for drivers missing operations

KuzuDriver doesn't implement the new operations interfaces, so the
NotImplementedError on init broke Kuzu tests. Now attributes are only
set when the driver provides them, and __getattr__ gives a clear error
on access.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

* Bump graphiti-core[falkordb] from 0.26.3 to 0.27.1 in /mcp_server (getzep#1231)

Bumps [graphiti-core[falkordb]](https://github.com/getzep/graphiti) from 0.26.3 to 0.27.1.
- [Release notes](https://github.com/getzep/graphiti/releases)
- [Commits](getzep/graphiti@v0.26.3...v0.27.1)

---
updated-dependencies:
- dependency-name: graphiti-core[falkordb]
  dependency-version: 0.27.1
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* feat: implement Neptune and Kuzu driver operations (getzep#1235)

* feat: implement Neptune and Kuzu driver operations

Extract scattered Neptune and Kuzu logic from nodes.py, edges.py,
search_utils.py, and maintenance utilities into structured operations
classes, following the same architecture established for Neo4j and
FalkorDB in getzep#1232.

Each driver now has 11 operations classes: entity_node_ops,
episode_node_ops, community_node_ops, saga_node_ops, entity_edge_ops,
episodic_edge_ops, community_edge_ops, has_episode_edge_ops,
next_episode_edge_ops, search_ops, and graph_ops.

Neptune-specific: AOSS fulltext search, comma-separated embeddings,
manual cosine similarity, removeKeyFromMap() for saves.

Kuzu-specific: RelatesToNode_ intermediate pattern, JSON attributes,
QUERY_FTS_INDEX/array_cosine_similarity, BFS depth doubling,
Saga/HAS_EPISODE/NEXT_EPISODE schema additions.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: address PR review comments for Neptune/Kuzu operations

- Extract `_label_propagation` and `Neighbor` to shared `graph_utils.py`
  module, removing duplication across all 4 driver graph_ops.py files
- Extract `_parse_kuzu_entity_node` and `_parse_kuzu_entity_edge` to
  shared `kuzu/operations/record_parsers.py`, removing duplication
  across entity_node_ops, entity_edge_ops, graph_ops, and search_ops
- Fix UNWIND bug in Kuzu `node_distance_reranker` and
  `episode_mentions_reranker` (Kuzu doesn't support UNWIND)
- Fix `_build_kuzu_fulltext_query` max_query_length calculation bug
  (`len(group_ids or '')` was meaningless)
- Replace inline import + cast pattern with constructor dependency
  injection for Neptune AOSS access in community_node_ops, search_ops,
  and graph_ops
- Use existing `calculate_cosine_similarity` from `search_utils.py`
  instead of duplicating it in Neptune search_ops

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
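The shared `_label_propagation` helper mentioned above implements community detection. A toy synchronous sketch of the technique (the shared graph_utils version may differ in tie-breaking and iteration order):

```python
def label_propagation(
    adjacency: dict[str, set[str]], max_iters: int = 20
) -> dict[str, int]:
    """Toy label propagation: each node repeatedly adopts the most
    common label among its neighbors until labels stabilize, yielding
    one label per community. Illustrative only."""
    labels = {node: i for i, node in enumerate(sorted(adjacency))}
    for _ in range(max_iters):
        changed = False
        for node in sorted(adjacency):
            neighbors = adjacency[node]
            if not neighbors:
                continue
            counts: dict[int, int] = {}
            for nb in neighbors:
                counts[labels[nb]] = counts.get(labels[nb], 0) + 1
            # Most frequent neighbor label; break ties by smallest label.
            top = max(counts.values())
            best = min(lbl for lbl, c in counts.items() if c == top)
            if labels[node] != best:
                labels[node] = best
                changed = True
        if not changed:
            break
    return labels
```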

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

* chore: bump version to 0.28.0 and document graph driver architecture (getzep#1236)

* chore: bump version to 0.28.0 and document graph driver architecture

Bump graphiti-core to 0.28.0 and update the MCP server dependency to
match. Add a new "Graph Driver Architecture" section to the README
explaining how the pluggable driver layer works and how to add a new
graph database backend.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* docs: address PR review comments on driver architecture section

- Add legacy directories (graph_operations/, search_interface/) and
  Kuzu record_parsers.py to the diagram, with a "simplified; see source"
  note
- Clarify that the ABC defines operations properties as optional (| None)
  and concrete drivers override to return non-optional types

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

* fix: remove PII from log messages (getzep#1237)

* fix: remove PII from log messages

Remove entity names, edge facts, and LLM input/output content from log
messages to prevent personally identifiable information from leaking
into logs. Replace with UUIDs, counts, and structural metadata only.

Changes:
- edge_operations.py: Remove entity names from WARNING logs, replace
  full edge objects and name tuples with UUIDs in DEBUG logs
- node_operations.py: Remove entity names from WARNING and DEBUG logs,
  log only UUIDs and counts instead of (name, uuid) tuples
- llm_client/client.py: Replace full message content dump in
  _get_failed_generation_log with message count and role metadata

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: preserve schema metadata and truncated output in logs

Address review feedback — the initial PII fix overcorrected by removing
non-PII debugging context:

- Restore relation types in edge WARNING logs (schema metadata, not PII)
- Restore truncated duplicate_name in dedup WARNING (needed for diagnosis)
- Restore truncated entity name (first 30 chars) in summary WARNING
- Restore truncated raw LLM output (first 500 chars) in failed generation
  ERROR logs — malformed output is structural, not user content

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

* fix: extract custom edge attributes on first episode ingestion (getzep#1242)

The fast path in resolve_extracted_edge() returned early when no
related/existing edges existed, skipping the LLM attribute extraction
call. This meant edges created during the first episode never had
their custom ontology attributes populated.

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

* fix: replace diskcache with sqlite-based cache to resolve CVE (getzep#1238)

* fix: replace diskcache with sqlite-based cache to resolve CVE

diskcache <= 5.6.3 has an unsafe pickle deserialization vulnerability
with no patched version available. Replace it with a minimal SQLite +
JSON cache implementation that only stores JSON-serializable data.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: add thread safety, error handling, and cleanup to LLMCache

- Use check_same_thread=False for safe cross-thread SQLite access
- Handle JSON serialization/deserialization errors gracefully
- Add __del__ for connection cleanup on garbage collection

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: upgrade urllib3 to 2.6.3 in examples lock file

Fixes decompression-bomb redirect bypass vulnerability (requires >= 2.6.3).
The main and mcp_server lock files already had 2.6.3.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* test: add unit tests for LLMCache

Covers get/set, overwrites, nested values, non-serializable handling,
corrupted entry recovery, directory creation, persistence, and cleanup.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
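The diskcache replacement above amounts to a small SQLite table holding JSON strings. A minimal sketch under those constraints (method names are assumptions, not the PR's exact API):

```python
import json
import sqlite3


class LLMCache:
    """SQLite + JSON cache sketch: only JSON-serializable values are
    stored, so unsafe pickle deserialization is avoided entirely."""

    def __init__(self, path: str = ":memory:") -> None:
        # check_same_thread=False allows cross-thread access, as the PR notes.
        self._conn = sqlite3.connect(path, check_same_thread=False)
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS cache (key TEXT PRIMARY KEY, value TEXT)"
        )

    def set(self, key: str, value) -> None:
        try:
            blob = json.dumps(value)
        except (TypeError, ValueError):
            return  # gracefully skip values that aren't JSON-serializable
        with self._conn:
            self._conn.execute(
                "INSERT OR REPLACE INTO cache (key, value) VALUES (?, ?)",
                (key, blob),
            )

    def get(self, key: str):
        row = self._conn.execute(
            "SELECT value FROM cache WHERE key = ?", (key,)
        ).fetchone()
        if row is None:
            return None
        try:
            return json.loads(row[0])
        except ValueError:
            return None  # corrupted entry: recover by treating as a miss
```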

* chore: bump version to 0.28.1 (getzep#1243)

Patch release so that mcp_server and server lockfiles can drop the
diskcache transitive dependency once published, resolving dependabot
alerts #69 and #70.

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

* chore: regenerate lockfiles to drop diskcache (getzep#1244)

* chore: regenerate lockfiles to drop diskcache dependency

Resolves dependabot alerts #69 and #70 (unsafe pickle deserialization
in diskcache). Now that graphiti-core 0.28.1 is published without
diskcache, all downstream lockfiles can be updated.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Update server/pyproject.toml

Co-authored-by: claude[bot] <209825114+claude[bot]@users.noreply.github.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: claude[bot] <209825114+claude[bot]@users.noreply.github.com>

* @Yifan-233-max has signed the CLA in getzep#1245

* @sprotasovitsky has signed the CLA in getzep#1254

* @hanxiao has signed the CLA in getzep#1257

* docs(sync): add upstream baseline and graphiti_core patch classification

* fix(edges): skip malformed RELATES_TO rows in get_between_nodes (#66)

* fix(edges): ignore malformed RELATES_TO edges in get_between_nodes

Filter out edges missing uuid/group_id/episodes to avoid EntityEdge validation failures
when legacy malformed relationships exist between node pairs.

* fix(edges): ignore malformed RELATES_TO rows in get_by_node_uuid/get_by_uuids/get_by_group_ids

* fix(edges): add null guards to Kuzu get_between_nodes query

Add WHERE e.uuid IS NOT NULL AND e.group_id IS NOT NULL AND e.episodes IS NOT NULL
to the Kuzu branch of get_between_nodes, matching the Neo4j branch's guards.

Addresses review finding on PR #66.

(cherry picked from commit ff34e16)
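The guard described above is a straightforward row filter before EntityEdge validation. A sketch, using a hypothetical helper name:

```python
def filter_well_formed_edges(rows: list[dict]) -> list[dict]:
    """Drop RELATES_TO rows missing uuid, group_id, or episodes, so
    legacy malformed relationships between node pairs don't trigger
    EntityEdge validation failures. Illustrative helper only."""
    required = ("uuid", "group_id", "episodes")
    return [row for row in rows if all(row.get(key) is not None for key in required)]
```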

* feat: trust-aware retrieval — post-RRF additive boost (#63)

* feat: trust-aware retrieval — post-RRF additive boost for promoted facts

- Add trust_weight field to SearchConfig (default 0.0 = disabled, backwards compat)
- Add rrf_with_trust_boost() and load_trust_scores() to search_utils.py
- Add EDGE/NODE_HYBRID_SEARCH_RRF_TRUST recipes
- Wire trust boost into edge and node search pipelines in search.py
- MCP server: GRAPHITI_TRUST_WEIGHT env var (default 0.15)

* fix: review findings — default trust_weight=0.0, skip episode_mentions, flatten double RRF, safe env parsing

- H1: MCP TRUST_WEIGHT default 0.15 → 0.0 (opt-in, not opt-out)
- H2: Trust boost only for RRF reranker, not episode_mentions (was no-op with overhead)
- M1: Remove redundant outer rrf() call in trust branch (use set comprehension for UUIDs)
- L1: Try/except on GRAPHITI_TRUST_WEIGHT env var parsing

* fix: ruff lint — consistent trust_weight default, strict zip, remove unused import

- rrf_with_trust_boost() default trust_weight 0.15 → 0.0 (consistent with SearchConfig)
- zip(uuids, rrf_scores, strict=True) per B905
- Remove unused OntologyRegistry import (F401)
- Fix import sorting (I001)

(cherry picked from commit f93924f)
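The post-RRF additive boost above can be sketched as: compute standard reciprocal rank fusion scores, then add `trust_weight * trust(doc)` before the final sort. With `trust_weight=0.0` this reduces to plain RRF, matching the backwards-compatible default. The signature below is an assumption, not graphiti's real `rrf_with_trust_boost`:

```python
def rrf_with_trust_boost(
    rankings: list[list[str]],
    trust_scores: dict[str, float],
    trust_weight: float = 0.0,
    k: int = 60,
) -> list[str]:
    """Reciprocal rank fusion with a post-RRF additive trust boost:
    score(d) = sum over rankings of 1/(k + rank(d)) + trust_weight * trust(d).
    Sketch of the technique described above."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc in enumerate(ranking, start=1):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank)
    # Additive boost applied after fusion, never replacing the RRF score.
    for doc in scores:
        scores[doc] += trust_weight * trust_scores.get(doc, 0.0)
    return sorted(scores, key=lambda d: scores[d], reverse=True)
```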

* feat(dedupe): migration-only deterministic edge dedupe mode (#67)

* feat(dedupe): add deterministic migration mode to bypass semantic edge dedupe

When GRAPHITI_DEDUPE_MODE=deterministic:
- keep exact-match fast path
- skip LLM duplicate/contradiction resolution
- preserve optional attribute extraction

Intended for controlled migration backfills where semantic dedupe instability
must not block canonical ingestion.

* refactor(dedupe): replace env-var GRAPHITI_DEDUPE_MODE with explicit dedupe_mode parameter

- Remove os.getenv('GRAPHITI_DEDUPE_MODE') from resolve_extracted_edge
- Add dedupe_mode: Literal['semantic','deterministic']='semantic' to:
  - resolve_extracted_edge(...)
  - resolve_extracted_edges(...)
  - _extract_and_resolve_edges(...) [internal helper]
  - add_episode(...) [public API]
- Thread parameter through all call sites
- Default remains 'semantic' — no behavior change for existing callsites
- add_episode_bulk and add_triplet implicitly keep 'semantic' via default

Addresses review finding on PR #67: env-var global bypass too risky.

* fix(edge_ops): preserve semantic-mode call signature for resolve_extracted_edge

Avoid passing dedupe_mode kwarg in semantic mode so existing monkeypatched tests/callsites
without dedupe_mode parameter remain compatible.

* docs+api: document dedupe_mode and ontology safety notes for migration hardening

(cherry picked from commit 5f4e7c0)
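The dedupe_mode switch above keeps the exact-match fast path in both modes and only skips the LLM step in deterministic mode. A simplified sketch of that control flow (the function and helper names are illustrative, not graphiti's real edge-resolution API):

```python
from typing import Literal


def resolve_edge(
    extracted_fact: str,
    existing_facts: list[str],
    dedupe_mode: Literal["semantic", "deterministic"] = "semantic",
    llm_is_duplicate=None,
) -> str:
    """Sketch of the dedupe_mode parameter threaded through edge
    resolution: 'semantic' (default) preserves LLM duplicate detection;
    'deterministic' bypasses it for controlled migration backfills."""
    # Exact-match fast path, kept in both modes.
    if extracted_fact in existing_facts:
        return "duplicate"
    # Deterministic mode: never call the LLM; treat the edge as new.
    if dedupe_mode == "deterministic":
        return "new"
    # Semantic mode: fall back to LLM-based duplicate resolution.
    if llm_is_duplicate is not None and any(
        llm_is_duplicate(extracted_fact, fact) for fact in existing_facts
    ):
        return "duplicate"
    return "new"
```

Because the default stays `'semantic'`, existing call sites see no behavior change, which is the point of the explicit-parameter refactor over the earlier env-var approach.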

* ci(sync): add graphiti_core allowlist guardrail

* sync: align maintenance tests with upstream and allow upstream-sync reports

* ci: use ubuntu-latest for upstream legacy workflows in fork

* ci: disable upstream legacy workflow jobs in fork repo

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: Preston Rasmussen <109292228+prasmussen15@users.noreply.github.com>
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
Co-authored-by: Jack Ryan <61809814+jackaldenryan@users.noreply.github.com>
Co-authored-by: Daniel Chalef <131175+danielchalef@users.noreply.github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: claude[bot] <209825114+claude[bot]@users.noreply.github.com>