perf: add FalkorDB HNSW vector indices and fix O(n) fulltext re-match #1287
base: main
Changes from 2 commits
c1d0efe
bc33661
2193599
f39e686
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -65,7 +65,7 @@ | |
| from graphiti_core.driver.operations.next_episode_edge_ops import NextEpisodeEdgeOperations | ||
| from graphiti_core.driver.operations.saga_node_ops import SagaNodeOperations | ||
| from graphiti_core.driver.operations.search_ops import SearchOperations | ||
| from graphiti_core.graph_queries import get_fulltext_indices, get_range_indices | ||
| from graphiti_core.graph_queries import get_fulltext_indices, get_range_indices, get_vector_indices | ||
| from graphiti_core.utils.datetime_utils import convert_datetimes_to_strings | ||
|
|
||
| logger = logging.getLogger(__name__) | ||
|
|
@@ -292,14 +292,31 @@ async def delete_all_indexes(self) -> None: | |
| f'DROP FULLTEXT INDEX FOR ()-[e:{label}]-() ON (e.{field_name})' | ||
| ) | ||
| ) | ||
| elif 'VECTOR' in index_type: | ||
| if entity_type == 'NODE': | ||
| drop_tasks.append( | ||
| self.execute_query( | ||
| f'DROP VECTOR INDEX FOR (n:{label}) ON (n.{field_name})' | ||
| ) | ||
| ) | ||
| elif entity_type == 'RELATIONSHIP': | ||
| drop_tasks.append( | ||
| self.execute_query( | ||
| f'DROP VECTOR INDEX FOR ()-[e:{label}]-() ON (e.{field_name})' | ||
| ) | ||
| ) | ||
|
|
||
| if drop_tasks: | ||
| await asyncio.gather(*drop_tasks) | ||
|
|
||
| async def build_indices_and_constraints(self, delete_existing=False): | ||
| if delete_existing: | ||
| await self.delete_all_indexes() | ||
| index_queries = get_range_indices(self.provider) + get_fulltext_indices(self.provider) | ||
| index_queries = ( | ||
| get_range_indices(self.provider) | ||
| + get_fulltext_indices(self.provider) | ||
| + get_vector_indices(self.provider) | ||
| ) | ||
|
Comment on lines +315 to +319
|
||
| for query in index_queries: | ||
| await self.execute_query(query) | ||
|
|
||
|
|
||
| Original file line number | Diff line number | Diff line change | ||||||
|---|---|---|---|---|---|---|---|---|
|
|
@@ -231,8 +231,8 @@ async def edge_fulltext_search( | |||||||
| """ | ||||||||
| UNWIND $ids as id | ||||||||
| MATCH (n:Entity)-[e:RELATES_TO]->(m:Entity) | ||||||||
| WHERE e.group_id IN $group_ids | ||||||||
| AND id(e)=id | ||||||||
| WHERE e.group_id IN $group_ids | ||||||||
| AND id(e)=id | ||||||||
| """ | ||||||||
| + filter_query | ||||||||
| + """ | ||||||||
|
|
@@ -265,6 +265,34 @@ async def edge_fulltext_search( | |||||||
| ) | ||||||||
| else: | ||||||||
| return [] | ||||||||
| elif driver.provider == GraphProvider.FALKORDB: | ||||||||
| # FalkorDB's queryRelationships returns the actual relationship object, | ||||||||
| # so use startNode/endNode directly instead of re-matching by uuid (which | ||||||||
| # causes an O(n) scan of all RELATES_TO edges). | ||||||||
| query = ( | ||||||||
| get_relationships_query('edge_name_and_fact', limit=limit, provider=driver.provider) | ||||||||
| + """ | ||||||||
| YIELD relationship AS e, score | ||||||||
| WITH e, score, startNode(e) AS n, endNode(e) AS m | ||||||||
| """ | ||||||||
| + filter_query | ||||||||
| + """ | ||||||||
| RETURN | ||||||||
| """ | ||||||||
| + get_entity_edge_return_query(driver.provider) | ||||||||
| + """ | ||||||||
| ORDER BY score DESC | ||||||||
| LIMIT $limit | ||||||||
| """ | ||||||||
| ) | ||||||||
|
|
||||||||
| records, _, _ = await driver.execute_query( | ||||||||
| query, | ||||||||
| query=fuzzy_query, | ||||||||
| limit=limit, | ||||||||
| routing_='r', | ||||||||
| **filter_params, | ||||||||
| ) | ||||||||
| else: | ||||||||
| query = ( | ||||||||
| get_relationships_query('edge_name_and_fact', limit=limit, provider=driver.provider) | ||||||||
|
|
@@ -410,6 +438,43 @@ async def edge_similarity_search( | |||||||
| ) | ||||||||
| else: | ||||||||
| return [] | ||||||||
| elif driver.provider == GraphProvider.FALKORDB: | ||||||||
| # Use HNSW vector index for O(log n) search instead of brute-force scan. | ||||||||
| # Over-fetch to compensate for post-filtering on group_id, edge_uuids, etc. | ||||||||
| over_fetch_limit = limit * 10 | ||||||||
|
|
||||||||
| post_filter_parts = list(filter_queries) | ||||||||
| post_filter_parts.append('score > $min_score') | ||||||||
| post_filter = ' WHERE ' + ' AND '.join(post_filter_parts) | ||||||||
|
|
||||||||
| query = ( | ||||||||
| 'CALL db.idx.vector.queryRelationships(' | ||||||||
| "'RELATES_TO', 'fact_embedding', $over_fetch_limit, vecf32($search_vector))" | ||||||||
| """ | ||||||||
| YIELD relationship AS e, score | ||||||||
| MATCH (n:Entity)-[e]->(m:Entity) | ||||||||
| WITH DISTINCT e, n, m, score | ||||||||
|
||||||||
| MATCH (n:Entity)-[e]->(m:Entity) | |
| WITH DISTINCT e, n, m, score | |
| WITH e, score, startNode(e) AS n, endNode(e) AS m |
Edge similarity search re-match — Replaced MATCH (n:Entity)-[e]->(m:Entity) / WITH DISTINCT e, n, m, score with WITH e, score, startNode(e) AS n, endNode(e) AS m, consistent with the fix already applied in edge_fulltext_search.
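To make the change concrete, here is a minimal sketch of how the FalkorDB edge similarity query could be assembled once the re-match is dropped. The helper name build_edge_similarity_query, its parameters, and the simplified RETURN clause are assumptions for illustration only; the PR builds the query inline and uses the project's own return-query helper.

```python
from typing import Dict, List, Tuple


def build_edge_similarity_query(
    filter_queries: List[str], limit: int, over_fetch_factor: int = 10
) -> Tuple[str, Dict[str, int]]:
    """Hypothetical helper: assemble the HNSW query text and its limit parameters."""
    # Post-filters run after the index lookup, so the score threshold is
    # appended to whatever filters the caller already collected.
    post_filter_parts = list(filter_queries)
    post_filter_parts.append('score > $min_score')
    post_filter = ' WHERE ' + ' AND '.join(post_filter_parts)

    query = (
        "CALL db.idx.vector.queryRelationships("
        "'RELATES_TO', 'fact_embedding', $over_fetch_limit, vecf32($search_vector))\n"
        "YIELD relationship AS e, score\n"
        # Read the endpoints from the yielded relationship instead of re-matching.
        "WITH e, score, startNode(e) AS n, endNode(e) AS m\n"
        + post_filter
        + "\n"
        # Simplified RETURN clause; an assumption, not the PR's return query.
        "RETURN e, n, m, score\n"
        "ORDER BY score DESC\n"
        "LIMIT $limit"
    )
    # Over-fetch so that post-filtering still leaves up to `limit` rows.
    params = {'over_fetch_limit': limit * over_fetch_factor, 'limit': limit}
    return query, params
```

Calling build_edge_similarity_query(['e.group_id IN $group_ids'], 5) yields a query with no MATCH clause and an over-fetch limit of 50, which mirrors the limit * 10 factor in the diff.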
Copilot
AI
Mar 1, 2026
Test coverage gap: the new FalkorDB HNSW branches (vector queries + post-filtering/min_score behavior) aren’t exercised by the existing search tests (they currently skip FalkorDB). Consider adding a FalkorDBLite-backed integration test or a unit test that asserts the generated Cypher + parameters for the FalkorDB provider.
Test coverage — Added 11 unit tests for the FalkorDB HNSW branches in edge_similarity_search, node_similarity_search, and community_similarity_search. Tests verify correct HNSW index queries, startNode/endNode usage, over-fetch limits, group_id filtering, and min_score filtering.
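As an illustration of the kind of unit test described here, the sketch below asserts on the Cypher text and parameters handed to a mocked driver. The function falkordb_edge_similarity is a hypothetical stand-in for the FalkorDB branch, not the project's actual search function, and the test name is likewise assumed; only the query fragments and the routing_='r' argument are taken from the diff.

```python
import asyncio
from unittest.mock import AsyncMock


async def falkordb_edge_similarity(driver, search_vector, limit, min_score):
    # Hypothetical stand-in for the FalkorDB branch under test.
    query = (
        "CALL db.idx.vector.queryRelationships("
        "'RELATES_TO', 'fact_embedding', $over_fetch_limit, vecf32($search_vector))\n"
        "YIELD relationship AS e, score\n"
        "WITH e, score, startNode(e) AS n, endNode(e) AS m\n"
        "WHERE score > $min_score\n"
        "RETURN e, n, m, score ORDER BY score DESC LIMIT $limit"
    )
    records, _, _ = await driver.execute_query(
        query,
        search_vector=search_vector,
        over_fetch_limit=limit * 10,
        limit=limit,
        min_score=min_score,
        routing_='r',
    )
    return records


def check_falkordb_branch():
    # The mocked driver records the call, so no running FalkorDB is needed.
    driver = AsyncMock()
    driver.execute_query.return_value = ([], None, None)

    records = asyncio.run(
        falkordb_edge_similarity(driver, [0.1, 0.2], limit=3, min_score=0.6)
    )

    call = driver.execute_query.call_args
    query, params = call.args[0], call.kwargs
    assert records == []
    assert 'db.idx.vector.queryRelationships' in query
    assert 'startNode(e) AS n' in query and 'MATCH' not in query
    assert params['over_fetch_limit'] == 30
    assert params['min_score'] == 0.6
    return query, params
```

Asserting on the generated query keeps the test independent of a live database, at the cost of not verifying that FalkorDB accepts the query; an integration test would still be needed for that.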
delete_all_indexes() now drops VECTOR indexes here, but the FalkorDB GraphMaintenanceOperations implementation (graphiti_core/driver/falkordb/operations/graph_ops.py) still only drops RANGE/FULLTEXT. This duplication can lead to different behavior depending on which API a caller uses; consider updating the operations implementation too or consolidating index management in one place.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
delete_all_indexes duplication — Added elif 'VECTOR' in index_type: branch to FalkorGraphMaintenanceOperations.delete_all_indexes() in graph_ops.py, matching the pattern already in FalkorDriver.delete_all_indexes().
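One way to keep the two delete_all_indexes() implementations from drifting apart would be a small shared helper that both call. The sketch below is an assumption, not part of this PR: the name vector_drop_statement is hypothetical, and it covers only the VECTOR statements shown in the diff.

```python
from typing import Optional


def vector_drop_statement(entity_type: str, label: str, field_name: str) -> Optional[str]:
    """Hypothetical shared helper: build the DROP statement for one vector index."""
    if entity_type == 'NODE':
        return f'DROP VECTOR INDEX FOR (n:{label}) ON (n.{field_name})'
    if entity_type == 'RELATIONSHIP':
        return f'DROP VECTOR INDEX FOR ()-[e:{label}]-() ON (e.{field_name})'
    # Unknown entity types produce no statement, so callers can skip them.
    return None
```

For example, vector_drop_statement('RELATIONSHIP', 'RELATES_TO', 'fact_embedding') returns the same relationship DROP statement that the driver builds inline in the diff.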