[core][python] Add global index search modes #8255

Draft
JingsongLi wants to merge 10 commits into apache:master from JingsongLi:codex/vector-raw-fallback

Conversation

@JingsongLi JingsongLi commented Jun 16, 2026

Contributor

Summary

Add global-index.search-mode as the freshness/performance switch for global-index queries. It accepts three values: fast (the default) reads only from the index; full compares the snapshot nextRowId with the global-index row-id coverage and scans raw data only when an uncovered range exists; detail scans data file metadata to find the exact unindexed row ranges caused by updates or rewrites.
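
For readers who want a concrete picture of the option, here is a minimal sketch of how the three values could be modeled and parsed. It is illustrative only: SearchMode and parse_search_mode are hypothetical names and are not taken from the pypaimon code in this PR. Only the option key, the three values, and the fast default come from the summary above; the case-insensitive parsing is an assumption.

```python
from enum import Enum

# Option key and values as named in the PR summary.
OPTION_KEY = "global-index.search-mode"


class SearchMode(Enum):
    """Hypothetical enum for the three search modes."""
    FAST = "fast"      # index-only reads (default)
    FULL = "full"      # nextRowId + index coverage, raw scan only for gaps
    DETAIL = "detail"  # scan data file metadata for exact unindexed ranges


def parse_search_mode(options):
    """Read the mode from a plain dict of table options.

    Falls back to FAST when the key is absent. Accepting mixed case and
    surrounding whitespace is an assumption of this sketch.
    """
    raw = options.get(OPTION_KEY, SearchMode.FAST.value)
    try:
        return SearchMode(raw.strip().lower())
    except ValueError:
        allowed = ", ".join(m.value for m in SearchMode)
        raise ValueError(
            f"Invalid {OPTION_KEY}: {raw!r} (expected one of: {allowed})"
        ) from None
```

Under these assumptions, parse_search_mode({}) yields SearchMode.FAST, and an unknown value such as "slow" raises ValueError rather than silently falling back.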

Changes

  • Replace the unreleased global-index.fast-search option with global-index.search-mode = fast | full | detail in Java, Python, and generated docs.
  • Make scalar global-index scans and vector search honor the three search modes.
  • In full mode, use snapshot nextRowId and global-index row-id coverage to avoid planning all data files unless an uncovered range exists.
  • In detail mode, scan data file metadata to detect exact unindexed row ranges, including invalidation caused by file rewrites or updates.
  • Carry snapshot nextRowId through Java and Python vector scan plans so vector raw-data search can use the lightweight full path.
  • Update Java and Python tests for default fast mode, full-mode freshness, filtered unindexed scans, and detail-mode exact range detection.
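
The full-mode check described in the list above can be sketched as a gap computation over row-id ranges. This is a simplified illustration, not the implementation in this PR: uncovered_ranges and merge_ranges are hypothetical helpers, and the sketch assumes row ids start at 0, that nextRowId is an exclusive upper bound, and that index coverage is given as half-open (start, end) ranges. It does not model the detail-mode case where rewrites or updates invalidate ranges that the index still claims to cover.

```python
from typing import List, Tuple

Range = Tuple[int, int]  # half-open [start, end), an assumption of this sketch


def merge_ranges(ranges: List[Range]) -> List[Range]:
    """Sort ranges and merge any that overlap or touch."""
    merged: List[Range] = []
    for start, end in sorted(ranges):
        if start >= end:
            continue  # skip empty ranges
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def uncovered_ranges(next_row_id: int, indexed: List[Range]) -> List[Range]:
    """Return the row-id ranges in [0, next_row_id) not covered by the index.

    An empty result means the index covers every row in the snapshot, so no
    data files need to be planned for a raw scan.
    """
    gaps: List[Range] = []
    cursor = 0
    for start, end in merge_ranges(indexed):
        if start >= next_row_id:
            break
        if start > cursor:
            gaps.append((cursor, start))
        cursor = max(cursor, end)
    if cursor < next_row_id:
        gaps.append((cursor, next_row_id))
    return gaps
```

With a snapshot whose nextRowId is 100 and index coverage of rows 0 to 69, the only gap is (70, 100), so only that tail would need a raw-data scan; with coverage of the whole range the result is empty and the query can stay index-only.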

Testing

  • git diff --check
  • rg -n "fast-search|global-index\\.fast-search|GLOBAL_INDEX_FAST_SEARCH|globalIndexFastSearch|global_index_fast_search|slowSearch|slow search|FastSearch" paimon-api paimon-core paimon-python docs/docs docs/generated
  • python -m compileall paimon-python/pypaimon/common/options/core_options.py paimon-python/pypaimon/read/scanner/file_scanner.py paimon-python/pypaimon/table/source/vector_search_read.py paimon-python/pypaimon/table/source/vector_search_scan.py
  • python -m pytest paimon-python/pypaimon/tests/global_index_test.py::PlanSnapshotFetchRegressionTest::test_search_mode_detail_filters_unindexed_rows_exactly paimon-python/pypaimon/tests/vector_search_filter_test.py::VectorSearchManySplitsTest::test_search_mode_controls_unindexed_range_scan
  • mvn -pl paimon-api,paimon-core -DskipTests spotless:check
  • mvn -pl paimon-core -am -Pfast-build -DskipTests -DfailIfNoTests=false compile test-compile
  • mvn -pl paimon-core -am -Pfast-build -DfailIfNoTests=false -Dtest=VectorSearchBuilderTest#testVectorSearchFullModeScansUnindexedData+testVectorSearchFastModeSkipsUnindexedDataByDefault+testVectorSearchFullModeScansFilteredUnindexedData,BtreeGlobalIndexTableTest#testBTreeGlobalIndexSearchModeControlsUnindexedData test

Notes

global-index.fast-search is intentionally not kept as a compatibility alias because it has not been released.

@JingsongLi JingsongLi changed the title from "[core] Support raw fallback for vector search" to "[WIP][core] Support raw fallback for vector search" Jun 16, 2026
@JingsongLi JingsongLi changed the title from "[WIP][core] Support raw fallback for vector search" to "[core] Support raw fallback for vector search" Jun 16, 2026
@JingsongLi JingsongLi marked this pull request as draft June 16, 2026 13:46
@JingsongLi JingsongLi changed the title from "[core] Support raw fallback for vector search" to "[core][python] Support raw fallback for vector search" Jun 16, 2026
@JingsongLi JingsongLi force-pushed the codex/vector-raw-fallback branch from 658ab46 to baefa9d Compare June 18, 2026 05:43
@JingsongLi JingsongLi changed the title from "[core][python] Support raw fallback for vector search" to "[core][python] Add global index fast search option" Jun 18, 2026
@JingsongLi JingsongLi force-pushed the codex/vector-raw-fallback branch 3 times, most recently from 867c483 to 8b1c6b2 Compare June 19, 2026 03:29
@JingsongLi JingsongLi force-pushed the codex/vector-raw-fallback branch from 23cc0fb to a49d70c Compare June 19, 2026 07:29
@JingsongLi JingsongLi changed the title from "[core][python] Add global index fast search option" to "[core][python] Add global index search modes" Jun 19, 2026