
Add Jina v5 embeddings provider #1257

Open
hanxiao wants to merge 1 commit into getzep:main from hanxiao:feat/jina-embeddings

Conversation

@hanxiao

@hanxiao hanxiao commented Feb 22, 2026

Summary

Add Jina AI as a supported embedding provider for Graphiti knowledge graphs.

jina-embeddings-v5-text-nano (239M params) is the top-ranked embedding model under 500M parameters on MTEB:

  • MTEB English v2: 71.0 avg
  • MMTEB multilingual: 65.5 avg
  • Half the size of comparable models with superior performance across retrieval, STS, and reranking tasks

MMTEB Multilingual Benchmark

MMTEB scores vs model size. jina-v5-text models (red) outperform models 2-16x their size. (source)

MTEB English Benchmark

MTEB English v2 scores. v5-text-nano (239M) achieves 71.0, matching models with 2x+ parameters. (source)

| Model | Params | Dim | Max Tokens | MTEB-EN | MMTEB |
| --- | --- | --- | --- | --- | --- |
| jina-embeddings-v5-text-nano (default) | 239M | 768 | 8192 | 71.0 | 65.5 |
| jina-embeddings-v5-text-small | 677M | 1024 | 32768 | 71.7 | 67.0 |

Paper: arXiv:2602.15547 | Blog | HuggingFace

Changes

  • New JinaEmbedder and JinaEmbedderConfig in graphiti_core/embedder/jina.py
  • Uses Jina's OpenAI-compatible API (https://api.jina.ai/v1/embeddings)
  • Supports task-specific embeddings via task parameter (retrieval.passage, retrieval.query, etc.)
  • API key via api_key config or JINA_API_KEY environment variable
  • Default model: jina-embeddings-v5-text-nano (768d)
  • Also supports: jina-embeddings-v5-text-small (1024d)
  • Comprehensive unit tests with mocked API calls

Usage

```python
from graphiti_core.embedder import JinaEmbedder, JinaEmbedderConfig

# Default configuration (passage embeddings for indexing)
config = JinaEmbedderConfig(
    api_key="your-jina-api-key",
    embedding_model="jina-embeddings-v5-text-nano",
    embedding_dim=768,
    task="retrieval.passage",
)
embedder = JinaEmbedder(config)

# For queries, use the retrieval.query task
query_config = JinaEmbedderConfig(
    api_key="your-jina-api-key",
    task="retrieval.query",
)
query_embedder = JinaEmbedder(query_config)
```
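Since the provider uses Jina's OpenAI-compatible endpoint, the request body the embedder sends is easy to picture. The sketch below builds that JSON payload as a plain function; the `task` field is Jina's extension to the OpenAI schema, and this helper (name and defaults included) is illustrative, not part of the PR:

```python
def build_embeddings_request(
    texts: list[str],
    model: str = "jina-embeddings-v5-text-nano",
    task: str = "retrieval.passage",
) -> dict:
    """Build the JSON body for POST https://api.jina.ai/v1/embeddings.

    `model` and `input` follow the OpenAI embeddings schema; `task` is
    a Jina-specific extension selecting the task-adapted embedding
    (e.g. retrieval.passage vs retrieval.query).
    """
    return {"model": model, "input": texts, "task": task}
```

Sending this body with an `Authorization: Bearer <JINA_API_KEY>` header is all the OpenAI-compatible API needs.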

@danielchalef
Member

danielchalef commented Feb 22, 2026

All contributors have signed the CLA ✍️ ✅
Posted by the CLA Assistant Lite bot.

@hanxiao
Author

hanxiao commented Feb 22, 2026

I have read the CLA Document and I hereby sign the CLA

danielchalef added a commit that referenced this pull request Feb 22, 2026
yhl999 added a commit to yhl999/bicameral that referenced this pull request Feb 22, 2026
…#70)

* fix(summary): exclude duplicate edges from node summary generation (getzep#1223)

* fix(summary): exclude duplicate edges from node summary generation

When resolving extracted edges, edges that match existing edges in the
graph were still being passed to node summary generation, causing facts
to be duplicated in summaries.

Changes:
- Update resolve_extracted_edges to return new_edges (non-duplicates)
- Update _extract_and_resolve_edges to pass through new_edges
- Pass only new_edges to extract_attributes_from_nodes in add_episode
- An edge is considered "new" if its resolved UUID matches extracted UUID

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* chore: bump version to 0.27.1

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>

* feat: simplify extraction pipeline and add batch entity summarization (getzep#1224)

* feat(llm): add token usage tracking for LLM calls

Add TokenUsageTracker class to track input/output tokens by prompt type
during LLM calls. This helps analyze token costs across different
operations like extract_nodes, extract_edges, resolve_nodes, etc.

Changes:
- Add graphiti_core/llm_client/token_tracker.py with TokenUsageTracker
- Update LLMClient base class to include token_tracker instance
- Update OpenAI base client to capture and record token usage
- Add token_tracker property on Graphiti class for easy access
- Update podcast_runner.py to print token usage summary after ingestion

Usage:
  client = Graphiti(...)
  # ... run ingestion ...
  client.token_tracker.print_summary(sort_by='prompt_name')

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
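The tracker described above accumulates input/output tokens per prompt type. A minimal sketch of that accounting, assuming a `record`/`summary` API (the real class in graphiti_core/llm_client/token_tracker.py may differ):

```python
from collections import defaultdict


class TokenUsageTracker:
    """Accumulate input/output token counts keyed by prompt name.

    Sketch only: method names and the summary shape are assumptions,
    not graphiti's actual interface.
    """

    def __init__(self) -> None:
        self._usage: dict[str, dict[str, int]] = defaultdict(
            lambda: {"input": 0, "output": 0}
        )

    def record(self, prompt_name: str, input_tokens: int, output_tokens: int) -> None:
        # Called once per LLM response, e.g. with usage numbers from the API.
        entry = self._usage[prompt_name]
        entry["input"] += input_tokens
        entry["output"] += output_tokens

    def summary(self) -> dict[str, dict[str, int]]:
        # Totals per prompt type (extract_nodes, extract_edges, ...).
        return dict(self._usage)
```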

* chore: temporarily disable summary early return optimization

Disable the optimization that skips LLM calls when node summary + edge
facts is under 2000 characters. This forces all summaries to be
generated via LLM for token usage analysis.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* Revert "chore: temporarily disable summary early return optimization"

This reverts the summary optimization changes.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* feat: simplify extraction pipeline and add batch entity summarization

- Remove chunking code for entity-dense episodes (node_operations.py)
  - Delete _extract_nodes_chunked, _extract_from_chunk, _merge_extracted_entities
  - Always use single LLM call for entity extraction

- Remove chunking code for edge extraction (edge_operations.py)
  - Remove MAX_NODES constant and generate_covering_chunks usage
  - Process all nodes in single LLM call instead of covering subsets

- Add batch entity summarization (node_operations.py, extract_nodes.py)
  - New SummarizedEntity and SummarizedEntities Pydantic models
  - New extract_summaries_batch prompt for batch processing
  - New _extract_entity_summaries_batch function
  - Nodes with short summaries get edge facts appended directly (no LLM)
  - Only nodes needing LLM summarization are batched together

- Simplify edge attribute extraction (extract_edges.py, edge_operations.py)
  - Remove episode_content from context (attributes from fact only)
  - Keep reference_time for temporal resolution
  - Add existing_attributes to preserve/update existing values

- Improve edge deduplication prompt (dedupe_edges.py, edge_operations.py)
  - Use continuous indexing across duplicate and invalidation candidates
  - Deduplicate invalidation candidates against duplicate candidates
  - Allow EXISTING FACTS to be both duplicates AND contradicted
  - Consolidate to single contradicted_facts field

- Remove obsolete chunking tests (test_entity_extraction.py)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* chore: bump version to 0.27.2pre1

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* Add token tracking for Anthropic/Gemini clients and missing tests

- Implement token tracking in AnthropicClient._generate_response()
  and generate_response() using result.usage.input_tokens/output_tokens
- Implement token tracking in GeminiClient._generate_response()
  and generate_response() using response.usage_metadata
- Add comprehensive unit tests for TokenUsageTracker class
- Add tests for _extract_entity_summaries_batch function covering:
  - No nodes needing summarization
  - Short summaries with edge facts
  - Long summaries requiring LLM
  - Node filter (should_summarize_node)
  - Batch multiple nodes
  - Unknown entity handling
  - Missing episode and summary

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* Update test_node_operations.py for batch summarization API

- Remove import of extract_attributes_from_node (function was removed)
- Add import of _extract_entity_summaries_batch
- Update tests to use new batch summarization API

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* Add MAX_NODES limit for batch entity summarization

- Add MAX_NODES = 30 constant
- Partition nodes needing summarization into flights of MAX_NODES
- Extract _process_summary_flight helper for processing each flight
- Each flight makes a separate LLM call to avoid context overflow

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
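The flight partitioning above is a simple chunking step. A sketch, assuming a standalone helper (the function name is illustrative; the PR folds this into `_process_summary_flight` handling):

```python
def partition_into_flights(nodes: list, max_nodes: int = 30) -> list[list]:
    """Split nodes needing summarization into flights of at most
    max_nodes, so each flight fits in a single LLM call without
    overflowing the context window."""
    return [nodes[i : i + max_nodes] for i in range(0, len(nodes), max_nodes)]
```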

* Change default OpenAI models to gpt-5-mini

Update both DEFAULT_MODEL and DEFAULT_SMALL_MODEL to use gpt-5-mini.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* Update podcast_runner.py to use default OpenAI models

Remove explicit model configuration to use the default gpt-5-mini models
from OpenAIClient.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* Revert default model changes to gpt-4.1-mini/nano

Restore the original default models instead of gpt-5-mini.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* Address PR review comments

- Fix unreachable code in _handle_structured_response (check response.refusal)
- Process node summary flights in parallel using semaphore_gather
- Use case-insensitive name matching for LLM summary responses
- Handle duplicate node names by applying summary to all matching nodes
- Fix edge case when both edge lists are empty in contradiction processing
- Fix potential AttributeError when episode is None in edge attributes
- Add tests for flight partitioning and case-insensitive name matching

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>

* chore(deps): update dependencies to fix dependabot alerts (getzep#1225)

Update lock files to address security alerts:
- cryptography, cffi, and other security-related packages
- Major version bumps for langchain-core and related packages
- Minor updates to other dependencies

Co-authored-by: Claude Haiku 4.5 <noreply@anthropic.com>

* @contextablemark has signed the CLA in getzep#1227

* @avonian has signed the CLA in getzep#1230

* feat: driver operations architecture redesign (getzep#1232)

* feat: add driver operations architecture with abstract interfaces and concrete implementations

Introduces a clean operations-based architecture for graph driver operations,
replacing inline query logic with abstract interfaces (ABCs) and concrete
implementations for both Neo4j and FalkorDB backends.

Key changes:
- Add QueryExecutor and Transaction ABCs for database-agnostic query execution
- Add 11 operations ABCs covering all node, edge, search, and graph maintenance operations
- Implement all 11 operations for Neo4j with real transaction commit/rollback
- Implement all 11 operations for FalkorDB with RedisSearch fulltext and vecf32 embeddings
- Add NodeNamespace and EdgeNamespace convenience wrappers on Graphiti class
- Wire operations into Neo4jDriver and FalkorDriver with property accessors
- Fix circular import by moving STOPWORDS to graphiti_core.driver.falkordb package
- Include design spec documenting architecture decisions and migration plan

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: address PR review feedback

- Fix ruff UP037: remove quoted type annotations in driver.py
  (redundant with `from __future__ import annotations`)
- Extract duplicate record parsers into shared record_parsers.py module,
  eliminating identical _entity_node_from_record, _entity_edge_from_record,
  _episodic_node_from_record, and _community_node_from_record across
  10 files in both Neo4j and FalkorDB operations
- Fix MAX_QUERY_LENGTH inconsistency in FalkorDB search_ops
  build_fulltext_query (was 8000, now uses module constant 128)
- Make namespace attributes unconditional with NotImplementedError
  for drivers that don't implement required operations

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: make namespace init graceful for drivers missing operations

KuzuDriver doesn't implement the new operations interfaces, so the
NotImplementedError on init broke Kuzu tests. Now attributes are only
set when the driver provides them, and __getattr__ gives a clear error
on access.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

* Bump graphiti-core[falkordb] from 0.26.3 to 0.27.1 in /mcp_server (getzep#1231)

Bumps [graphiti-core[falkordb]](https://github.com/getzep/graphiti) from 0.26.3 to 0.27.1.
- [Release notes](https://github.com/getzep/graphiti/releases)
- [Commits](getzep/graphiti@v0.26.3...v0.27.1)

---
updated-dependencies:
- dependency-name: graphiti-core[falkordb]
  dependency-version: 0.27.1
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* feat: implement Neptune and Kuzu driver operations (getzep#1235)

* feat: implement Neptune and Kuzu driver operations

Extract scattered Neptune and Kuzu logic from nodes.py, edges.py,
search_utils.py, and maintenance utilities into structured operations
classes, following the same architecture established for Neo4j and
FalkorDB in getzep#1232.

Each driver now has 11 operations classes: entity_node_ops,
episode_node_ops, community_node_ops, saga_node_ops, entity_edge_ops,
episodic_edge_ops, community_edge_ops, has_episode_edge_ops,
next_episode_edge_ops, search_ops, and graph_ops.

Neptune-specific: AOSS fulltext search, comma-separated embeddings,
manual cosine similarity, removeKeyFromMap() for saves.

Kuzu-specific: RelatesToNode_ intermediate pattern, JSON attributes,
QUERY_FTS_INDEX/array_cosine_similarity, BFS depth doubling,
Saga/HAS_EPISODE/NEXT_EPISODE schema additions.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: address PR review comments for Neptune/Kuzu operations

- Extract `_label_propagation` and `Neighbor` to shared `graph_utils.py`
  module, removing duplication across all 4 driver graph_ops.py files
- Extract `_parse_kuzu_entity_node` and `_parse_kuzu_entity_edge` to
  shared `kuzu/operations/record_parsers.py`, removing duplication
  across entity_node_ops, entity_edge_ops, graph_ops, and search_ops
- Fix UNWIND bug in Kuzu `node_distance_reranker` and
  `episode_mentions_reranker` (Kuzu doesn't support UNWIND)
- Fix `_build_kuzu_fulltext_query` max_query_length calculation bug
  (`len(group_ids or '')` was meaningless)
- Replace inline import + cast pattern with constructor dependency
  injection for Neptune AOSS access in community_node_ops, search_ops,
  and graph_ops
- Use existing `calculate_cosine_similarity` from `search_utils.py`
  instead of duplicating it in Neptune search_ops

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
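The shared `_label_propagation` helper mentioned above implements community detection. A toy synchronous sketch of the technique (the shared graph_utils version may differ in tie-breaking and iteration order):

```python
def label_propagation(
    adjacency: dict[str, set[str]], max_iters: int = 20
) -> dict[str, int]:
    """Toy label propagation: each node repeatedly adopts the most
    common label among its neighbors until labels stabilize, yielding
    one label per community. Illustrative only."""
    labels = {node: i for i, node in enumerate(sorted(adjacency))}
    for _ in range(max_iters):
        changed = False
        for node in sorted(adjacency):
            neighbors = adjacency[node]
            if not neighbors:
                continue
            counts: dict[int, int] = {}
            for nb in neighbors:
                counts[labels[nb]] = counts.get(labels[nb], 0) + 1
            # Most frequent neighbor label; break ties by smallest label.
            top = max(counts.values())
            best = min(lbl for lbl, c in counts.items() if c == top)
            if labels[node] != best:
                labels[node] = best
                changed = True
        if not changed:
            break
    return labels
```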

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

* chore: bump version to 0.28.0 and document graph driver architecture (getzep#1236)

* chore: bump version to 0.28.0 and document graph driver architecture

Bump graphiti-core to 0.28.0 and update the MCP server dependency to
match. Add a new "Graph Driver Architecture" section to the README
explaining how the pluggable driver layer works and how to add a new
graph database backend.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* docs: address PR review comments on driver architecture section

- Add legacy directories (graph_operations/, search_interface/) and
  Kuzu record_parsers.py to the diagram, with a "simplified; see source"
  note
- Clarify that the ABC defines operations properties as optional (| None)
  and concrete drivers override to return non-optional types

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

* fix: remove PII from log messages (getzep#1237)

* fix: remove PII from log messages

Remove entity names, edge facts, and LLM input/output content from log
messages to prevent personally identifiable information from leaking
into logs. Replace with UUIDs, counts, and structural metadata only.

Changes:
- edge_operations.py: Remove entity names from WARNING logs, replace
  full edge objects and name tuples with UUIDs in DEBUG logs
- node_operations.py: Remove entity names from WARNING and DEBUG logs,
  log only UUIDs and counts instead of (name, uuid) tuples
- llm_client/client.py: Replace full message content dump in
  _get_failed_generation_log with message count and role metadata

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: preserve schema metadata and truncated output in logs

Address review feedback — the initial PII fix overcorrected by removing
non-PII debugging context:

- Restore relation types in edge WARNING logs (schema metadata, not PII)
- Restore truncated duplicate_name in dedup WARNING (needed for diagnosis)
- Restore truncated entity name (first 30 chars) in summary WARNING
- Restore truncated raw LLM output (first 500 chars) in failed generation
  ERROR logs — malformed output is structural, not user content

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

* fix: extract custom edge attributes on first episode ingestion (getzep#1242)

The fast path in resolve_extracted_edge() returned early when no
related/existing edges existed, skipping the LLM attribute extraction
call. This meant edges created during the first episode never had
their custom ontology attributes populated.

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

* fix: replace diskcache with sqlite-based cache to resolve CVE (getzep#1238)

* fix: replace diskcache with sqlite-based cache to resolve CVE

diskcache <= 5.6.3 has an unsafe pickle deserialization vulnerability
with no patched version available. Replace it with a minimal SQLite +
JSON cache implementation that only stores JSON-serializable data.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: add thread safety, error handling, and cleanup to LLMCache

- Use check_same_thread=False for safe cross-thread SQLite access
- Handle JSON serialization/deserialization errors gracefully
- Add __del__ for connection cleanup on garbage collection

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: upgrade urllib3 to 2.6.3 in examples lock file

Fixes decompression-bomb redirect bypass vulnerability (requires >= 2.6.3).
The main and mcp_server lock files already had 2.6.3.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* test: add unit tests for LLMCache

Covers get/set, overwrites, nested values, non-serializable handling,
corrupted entry recovery, directory creation, persistence, and cleanup.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
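The diskcache replacement above amounts to a small SQLite table holding JSON strings. A minimal sketch under those constraints (method names are assumptions, not the PR's exact API):

```python
import json
import sqlite3


class LLMCache:
    """SQLite + JSON cache sketch: only JSON-serializable values are
    stored, so unsafe pickle deserialization is avoided entirely."""

    def __init__(self, path: str = ":memory:") -> None:
        # check_same_thread=False allows cross-thread access, as the PR notes.
        self._conn = sqlite3.connect(path, check_same_thread=False)
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS cache (key TEXT PRIMARY KEY, value TEXT)"
        )

    def set(self, key: str, value) -> None:
        try:
            blob = json.dumps(value)
        except (TypeError, ValueError):
            return  # gracefully skip values that aren't JSON-serializable
        with self._conn:
            self._conn.execute(
                "INSERT OR REPLACE INTO cache (key, value) VALUES (?, ?)",
                (key, blob),
            )

    def get(self, key: str):
        row = self._conn.execute(
            "SELECT value FROM cache WHERE key = ?", (key,)
        ).fetchone()
        if row is None:
            return None
        try:
            return json.loads(row[0])
        except ValueError:
            return None  # corrupted entry: recover by treating as a miss
```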

* chore: bump version to 0.28.1 (getzep#1243)

Patch release so that mcp_server and server lockfiles can drop the
diskcache transitive dependency once published, resolving dependabot
alerts #69 and #70.

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

* chore: regenerate lockfiles to drop diskcache (getzep#1244)

* chore: regenerate lockfiles to drop diskcache dependency

Resolves dependabot alerts #69 and #70 (unsafe pickle deserialization
in diskcache). Now that graphiti-core 0.28.1 is published without
diskcache, all downstream lockfiles can be updated.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Update server/pyproject.toml

Co-authored-by: claude[bot] <209825114+claude[bot]@users.noreply.github.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: claude[bot] <209825114+claude[bot]@users.noreply.github.com>

* @Yifan-233-max has signed the CLA in getzep#1245

* @sprotasovitsky has signed the CLA in getzep#1254

* @hanxiao has signed the CLA in getzep#1257

* docs(sync): add upstream baseline and graphiti_core patch classification

* fix(edges): skip malformed RELATES_TO rows in get_between_nodes (#66)

* fix(edges): ignore malformed RELATES_TO edges in get_between_nodes

Filter out edges missing uuid/group_id/episodes to avoid EntityEdge validation failures
when legacy malformed relationships exist between node pairs.

* fix(edges): ignore malformed RELATES_TO rows in get_by_node_uuid/get_by_uuids/get_by_group_ids

* fix(edges): add null guards to Kuzu get_between_nodes query

Add WHERE e.uuid IS NOT NULL AND e.group_id IS NOT NULL AND e.episodes IS NOT NULL
to the Kuzu branch of get_between_nodes, matching the Neo4j branch's guards.

Addresses review finding on PR #66.

(cherry picked from commit ff34e16)
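The guard described above is a straightforward row filter before EntityEdge validation. A sketch, using a hypothetical helper name:

```python
def filter_well_formed_edges(rows: list[dict]) -> list[dict]:
    """Drop RELATES_TO rows missing uuid, group_id, or episodes, so
    legacy malformed relationships between node pairs don't trigger
    EntityEdge validation failures. Illustrative helper only."""
    required = ("uuid", "group_id", "episodes")
    return [row for row in rows if all(row.get(key) is not None for key in required)]
```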

* feat: trust-aware retrieval — post-RRF additive boost (#63)

* feat: trust-aware retrieval — post-RRF additive boost for promoted facts

- Add trust_weight field to SearchConfig (default 0.0 = disabled, backwards compat)
- Add rrf_with_trust_boost() and load_trust_scores() to search_utils.py
- Add EDGE/NODE_HYBRID_SEARCH_RRF_TRUST recipes
- Wire trust boost into edge and node search pipelines in search.py
- MCP server: GRAPHITI_TRUST_WEIGHT env var (default 0.15)

* fix: review findings — default trust_weight=0.0, skip episode_mentions, flatten double RRF, safe env parsing

- H1: MCP TRUST_WEIGHT default 0.15 → 0.0 (opt-in, not opt-out)
- H2: Trust boost only for RRF reranker, not episode_mentions (was no-op with overhead)
- M1: Remove redundant outer rrf() call in trust branch (use set comprehension for UUIDs)
- L1: Try/except on GRAPHITI_TRUST_WEIGHT env var parsing

* fix: ruff lint — consistent trust_weight default, strict zip, remove unused import

- rrf_with_trust_boost() default trust_weight 0.15 → 0.0 (consistent with SearchConfig)
- zip(uuids, rrf_scores, strict=True) per B905
- Remove unused OntologyRegistry import (F401)
- Fix import sorting (I001)

(cherry picked from commit f93924f)
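The post-RRF additive boost above can be sketched as: compute standard reciprocal rank fusion scores, then add `trust_weight * trust(doc)` before the final sort. With `trust_weight=0.0` this reduces to plain RRF, matching the backwards-compatible default. The signature below is an assumption, not graphiti's real `rrf_with_trust_boost`:

```python
def rrf_with_trust_boost(
    rankings: list[list[str]],
    trust_scores: dict[str, float],
    trust_weight: float = 0.0,
    k: int = 60,
) -> list[str]:
    """Reciprocal rank fusion with a post-RRF additive trust boost:
    score(d) = sum over rankings of 1/(k + rank(d)) + trust_weight * trust(d).
    Sketch of the technique described above."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc in enumerate(ranking, start=1):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank)
    # Additive boost applied after fusion, never replacing the RRF score.
    for doc in scores:
        scores[doc] += trust_weight * trust_scores.get(doc, 0.0)
    return sorted(scores, key=lambda d: scores[d], reverse=True)
```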

* feat(dedupe): migration-only deterministic edge dedupe mode (#67)

* feat(dedupe): add deterministic migration mode to bypass semantic edge dedupe

When GRAPHITI_DEDUPE_MODE=deterministic:
- keep exact-match fast path
- skip LLM duplicate/contradiction resolution
- preserve optional attribute extraction

Intended for controlled migration backfills where semantic dedupe instability
must not block canonical ingestion.

* refactor(dedupe): replace env-var GRAPHITI_DEDUPE_MODE with explicit dedupe_mode parameter

- Remove os.getenv('GRAPHITI_DEDUPE_MODE') from resolve_extracted_edge
- Add dedupe_mode: Literal['semantic','deterministic']='semantic' to:
  - resolve_extracted_edge(...)
  - resolve_extracted_edges(...)
  - _extract_and_resolve_edges(...) [internal helper]
  - add_episode(...) [public API]
- Thread parameter through all call sites
- Default remains 'semantic' — no behavior change for existing callsites
- add_episode_bulk and add_triplet implicitly keep 'semantic' via default

Addresses review finding on PR #67: env-var global bypass too risky.

* fix(edge_ops): preserve semantic-mode call signature for resolve_extracted_edge

Avoid passing dedupe_mode kwarg in semantic mode so existing monkeypatched tests/callsites
without dedupe_mode parameter remain compatible.

* docs+api: document dedupe_mode and ontology safety notes for migration hardening

(cherry picked from commit 5f4e7c0)
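The dedupe_mode switch above keeps the exact-match fast path in both modes and only skips the LLM step in deterministic mode. A simplified sketch of that control flow (the function and helper names are illustrative, not graphiti's real edge-resolution API):

```python
from typing import Literal


def resolve_edge(
    extracted_fact: str,
    existing_facts: list[str],
    dedupe_mode: Literal["semantic", "deterministic"] = "semantic",
    llm_is_duplicate=None,
) -> str:
    """Sketch of the dedupe_mode parameter threaded through edge
    resolution: 'semantic' (default) preserves LLM duplicate detection;
    'deterministic' bypasses it for controlled migration backfills."""
    # Exact-match fast path, kept in both modes.
    if extracted_fact in existing_facts:
        return "duplicate"
    # Deterministic mode: never call the LLM; treat the edge as new.
    if dedupe_mode == "deterministic":
        return "new"
    # Semantic mode: fall back to LLM-based duplicate resolution.
    if llm_is_duplicate is not None and any(
        llm_is_duplicate(extracted_fact, fact) for fact in existing_facts
    ):
        return "duplicate"
    return "new"
```

Because the default stays `'semantic'`, existing call sites see no behavior change, which is the point of the explicit-parameter refactor over the earlier env-var approach.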

* ci(sync): add graphiti_core allowlist guardrail

* sync: align maintenance tests with upstream and allow upstream-sync reports

* ci: use ubuntu-latest for upstream legacy workflows in fork

* ci: disable upstream legacy workflow jobs in fork repo

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: Preston Rasmussen <109292228+prasmussen15@users.noreply.github.com>
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
Co-authored-by: Jack Ryan <61809814+jackaldenryan@users.noreply.github.com>
Co-authored-by: Daniel Chalef <131175+danielchalef@users.noreply.github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: claude[bot] <209825114+claude[bot]@users.noreply.github.com>