fix(search): wrap group_ids OR clause in parentheses for BM25 queries#1280

Closed
giulio-leone wants to merge 2 commits into getzep:main from giulio-leone:fix/issue-1249-bm25-group-ids-filter

giulio-leone commented Feb 28, 2026

Summary

Fixes #1249

When searching with multiple group_ids, the BM25 fulltext query filter was missing parentheses around the OR clause. Due to Lucene operator precedence (AND binds tighter than OR), only the last group_id was effectively filtered.

Root Cause

# Before (buggy):
group_id:"g1" OR group_id:"g2" OR group_id:"g3" AND (hello)
# Parsed as: g1 OR g2 OR (g3 AND hello)  ← only g3 is filtered!

# After (fixed):
(group_id:"g1" OR group_id:"g2" OR group_id:"g3") AND (hello)
# Parsed correctly: all group_ids are filtered
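The difference between the two query strings comes down to how the filter is concatenated. The following is a minimal sketch of that step, not the actual graphiti_core code: the function names `group_filter`, `build_query_before`, and `build_query_after` are hypothetical, input sanitization is omitted, and returning only the parenthesized search terms when no group_ids are given is an assumption.

```python
def group_filter(group_ids):
    """Join one group_id clause per id with OR; empty string if there are none."""
    return " OR ".join(f'group_id:"{g}"' for g in (group_ids or []))


def build_query_before(query, group_ids):
    # Pre-fix shape (hypothetical): the OR chain is joined to AND without parentheses.
    gf = group_filter(group_ids)
    return f"{gf} AND ({query})" if gf else f"({query})"


def build_query_after(query, group_ids):
    # Post-fix shape (hypothetical): the whole OR chain is wrapped before AND is appended.
    gf = group_filter(group_ids)
    return f"({gf}) AND ({query})" if gf else f"({query})"


groups = ["g1", "g2", "g3"]
print(build_query_before("hello", groups))
print(build_query_after("hello", groups))
```

The two printed lines reproduce the "Before" and "After" query strings shown above.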

Changes

  • graphiti_core/search/search_utils.py (fulltext_query): Wrap the group_ids OR filter in parentheses before appending AND
  • graphiti_core/driver/neo4j/operations/search_ops.py (_build_neo4j_fulltext_query): Same parenthesisation fix for the Neo4j-specific fulltext query builder
  • tests/utils/search/search_utils_test.py: Added regression tests covering single group_id, multiple group_ids, no group_ids, and empty group_ids list
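One way to phrase such a regression check without depending on the builder internals is to assert that no OR operator is left at the top level of the generated query. The helper below is a hypothetical illustration and is not taken from the PR's test file; it ignores escaped quotes and assumes operators are separated by whitespace.

```python
def top_level_operators(query: str) -> list[str]:
    """Return the AND/OR tokens that sit outside all parentheses and quotes."""
    ops, depth, in_quote, token = [], 0, False, ""
    for ch in query + " ":  # trailing space flushes the final token
        if ch == '"':
            in_quote = not in_quote
            token += ch
        elif in_quote:
            token += ch
        elif ch == "(":
            depth += 1
            token = ""
        elif ch == ")":
            depth -= 1
            token = ""
        elif ch.isspace():
            if depth == 0 and token in ("AND", "OR"):
                ops.append(token)
            token = ""
        else:
            token += ch
    return ops


buggy = 'group_id:"g1" OR group_id:"g2" OR group_id:"g3" AND (hello)'
fixed = '(group_id:"g1" OR group_id:"g2" OR group_id:"g3") AND (hello)'

# The unparenthesized query leaks its ORs to the top level; the fixed one does not.
assert top_level_operators(buggy) == ["OR", "OR", "AND"]
assert top_level_operators(fixed) == ["AND"]
```

A check of this shape fails on the pre-fix string for any list of two or more group_ids, and passes trivially for the single, none, and empty cases.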

danielchalef (Member) commented Feb 28, 2026

All contributors have signed the CLA ✍️ ✅
Posted by the CLA Assistant Lite bot.

giulio-leone force-pushed the fix/issue-1249-bm25-group-ids-filter branch from 835d4f9 to 31898ca on February 28, 2026 14:40

giulio-leone (Author) commented

I have read the CLA Document and I hereby sign the CLA

danielchalef added a commit that referenced this pull request Feb 28, 2026

giulio-leone (Author) commented

Friendly ping — CI is green and this is ready for review. Happy to address any feedback. Thanks!

giulio-leone force-pushed the fix/issue-1249-bm25-group-ids-filter branch from 31898ca to 26d784e on March 1, 2026 00:36
Copilot AI review requested due to automatic review settings March 1, 2026 00:36

Copilot AI left a comment

Pull request overview

This PR aims to fix Lucene operator-precedence issues in BM25 fulltext queries when filtering by multiple group_ids, by ensuring the OR’d group_id clause is parenthesized before appending the main query with AND.

Changes:

  • Wrap the group_id:"..." OR ... filter clause in parentheses when building Neo4j fulltext (Lucene) queries.

giulio-leone added a commit to giulio-leone/graphiti that referenced this pull request Mar 1, 2026
The same Lucene operator precedence bug fixed in _build_neo4j_fulltext_query
also existed in search_utils.fulltext_query(). Without parentheses, multi-group
queries like "group_id:a OR group_id:b AND (terms)" evaluate incorrectly as
"group_id:a OR (group_id:b AND (terms))".

Also adds 7 regression tests covering single/multiple/none/empty group_ids
for both fulltext_query and _build_neo4j_fulltext_query.

Refs: getzep#1280

giulio-leone (Author) commented

All CI checks pass. Fixes BM25 query precedence bug with group_ids. Includes 7 regression tests. Ready for review.

giulio-leone (Author) commented

Hi! Gentle ping — this PR is rebased, CI passes, and ready for review. Happy to address any feedback. Thanks!

giulio-leone and others added 2 commits March 9, 2026 15:10
Without parentheses, the Lucene query 'group_id:"1" OR group_id:"2"
OR group_id:"3" AND (search terms)' only applied the last group_id
due to AND having higher precedence than OR.

Wrap the group_ids filter in parentheses so the full query becomes:
'(group_id:"1" OR group_id:"2" OR group_id:"3") AND (search terms)'

Fixes getzep#1249

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The same Lucene operator precedence bug fixed in _build_neo4j_fulltext_query
also existed in search_utils.fulltext_query(). Without parentheses, multi-group
queries like "group_id:a OR group_id:b AND (terms)" evaluate incorrectly as
"group_id:a OR (group_id:b AND (terms))".

Also adds 7 regression tests covering single/multiple/none/empty group_ids
for both fulltext_query and _build_neo4j_fulltext_query.

Refs: getzep#1280

giulio-leone force-pushed the fix/issue-1249-bm25-group-ids-filter branch from 09c5ea1 to d6a11ec on March 9, 2026 14:10

giulio-leone (Author) commented

Closing in favor of #1311, which addresses the same BM25 fulltext query parentheses bug with an identical fix. Keeping one PR to avoid duplicate review burden.

getzep locked and limited conversation to collaborators Mar 9, 2026

Development

Successfully merging this pull request may close these issues.

[BUG] Missing parenthesis for BM25 search method query only searching with the last group_id setted