fix(search): wrap group_ids OR clause in parentheses for BM25 queries#1280
fix(search): wrap group_ids OR clause in parentheses for BM25 queries#1280giulio-leone wants to merge 2 commits intogetzep:mainfrom
Conversation
|
All contributors have signed the CLA ✍️ ✅ |
835d4f9 to
31898ca
Compare
|
I have read the CLA Document and I hereby sign the CLA |
|
Friendly ping — CI is green and this is ready for review. Happy to address any feedback. Thanks! |
31898ca to
26d784e
Compare
There was a problem hiding this comment.
Pull request overview
This PR aims to fix Lucene operator-precedence issues in BM25 fulltext queries when filtering by multiple group_ids, by ensuring the OR’d group_id clause is parenthesized before appending the main query with AND.
Changes:
- Wrap the
group_id:"..." OR ...filter clause in parentheses when building Neo4j fulltext (Lucene) queries.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
The same Lucene operator precedence bug fixed in _build_neo4j_fulltext_query also existed in search_utils.fulltext_query(). Without parentheses, multi-group queries like "group_id:a OR group_id:b AND (terms)" evaluate incorrectly as "group_id:a OR (group_id:b AND (terms))". Also adds 7 regression tests covering single/multiple/none/empty group_ids for both fulltext_query and _build_neo4j_fulltext_query. Refs: getzep#1280
|
All CI checks pass. Fixes BM25 query precedence bug with group_ids. Includes 7 regression tests. Ready for review. |
|
Hi! Gentle ping — this PR is rebased, CI passes, and ready for review. Happy to address any feedback. Thanks! |
Without parentheses, the Lucene query 'group_id:"1" OR group_id:"2" OR group_id:"3" AND (search terms)' only applied the last group_id due to AND having higher precedence than OR. Wrap the group_ids filter in parentheses so the full query becomes: '(group_id:"1" OR group_id:"2" OR group_id:"3") AND (search terms)' Fixes getzep#1249 Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The same Lucene operator precedence bug fixed in _build_neo4j_fulltext_query also existed in search_utils.fulltext_query(). Without parentheses, multi-group queries like "group_id:a OR group_id:b AND (terms)" evaluate incorrectly as "group_id:a OR (group_id:b AND (terms))". Also adds 7 regression tests covering single/multiple/none/empty group_ids for both fulltext_query and _build_neo4j_fulltext_query. Refs: getzep#1280
09c5ea1 to
d6a11ec
Compare
|
Closing in favor of #1311 which addresses the same BM25 fulltext query parentheses bug with an identical fix. Keeping one PR to avoid duplicate review burden. |
Summary
Fixes #1249
When searching with multiple
group_ids, the BM25 fulltext query filter was missing parentheses around the OR clause. Due to Lucene operator precedence (AND binds tighter than OR), only the lastgroup_idwas effectively filtered.Root Cause
Changes
graphiti_core/search/search_utils.py(fulltext_query): Wrap the group_ids OR filter in parentheses before appendingANDgraphiti_core/driver/neo4j/operations/search_ops.py(_build_neo4j_fulltext_query): Same parenthesisation fix for the Neo4j-specific fulltext query buildertests/utils/search/search_utils_test.py: Added regression tests covering single group_id, multiple group_ids, no group_ids, and empty group_ids list