Remote store recovery support for DataFusion indices #20323
base: feature/datafusion
Signed-off-by: Kamal Nayan <[email protected]>
❌ Gradle check result for 7d12a6a: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
❌ Gradle check result for db21058: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
❌ Gradle check result for cd942d4: null Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
❌ Gradle check result for 9bb11f3: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
Description
This PR implements the remote store recovery flow for CompositeEngine/DataFusion, enabling indices using non-Lucene data formats (Parquet, Arrow, etc.) to properly recover from remote store after node failures, restarts, and replica promotions.
Key Features
1. Format-Aware Recovery from Remote Store
- FileMetadata format information (e.g., "parquet", "arrow") when syncing segments from remote store
- syncSegmentsFromRemoteSegmentStore() uses format-aware FileMetadata keys instead of string-based keys
2. CompositeEngine Empty Store Handling
- FileNotFoundException during recovery when local store is empty (common in remote store recovery scenarios)
- LocalCheckpointTracker even when no prior commits exist
3. Engine Reset for Recovery
- resetEngineToGlobalCheckpoint() to properly initialize CompositeEngine with fresh translog BEFORE creating InternalEngine
- CatalogSnapshot.userData before serialization for recovery consistency
4. Checkpoint Tracking
- LastRefreshedCheckpointListener to CompositeEngine for tracking refresh checkpoints (required by RemoteStoreRefreshListener)
- lastRefreshedCheckpoint(), currentOngoingRefreshCheckpoint()
5. CatalogSnapshot Recovery Support
- setUserData() method to CatalogSnapshot for recovery scenarios
Test Coverage
Added comprehensive integration test suite DataFusionRemoteStoreRecoveryTests covering:
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license. For more information on following Developer Certificate of Origin and signing off your commits, please check here.
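Illustrative note on the format-aware keys mentioned under Key Features item 1: the sketch below shows why keying synced files by format plus name, rather than by name alone, may matter when several data formats coexist in one index. Every name here (`RemoteFile`, `FormatAwareKey`, `FormatAwareKeysSketch`, the `segment_0` file name) is hypothetical and chosen for illustration; this is not the `FileMetadata` class or the `syncSegmentsFromRemoteSegmentStore()` logic from this PR.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a file tracked in the remote store (not the PR's FileMetadata).
record RemoteFile(String dataFormat, String fileName, long length) {}

// Hypothetical composite key: data format plus file name.
record FormatAwareKey(String dataFormat, String fileName) {}

class FormatAwareKeysSketch {

    // String-based keys: files sharing a name across formats overwrite each other.
    static Map<String, RemoteFile> indexByName(List<RemoteFile> files) {
        Map<String, RemoteFile> byName = new HashMap<>();
        for (RemoteFile f : files) {
            byName.put(f.fileName(), f);
        }
        return byName;
    }

    // Format-aware keys: the data format is part of the key, so both entries survive.
    static Map<FormatAwareKey, RemoteFile> indexByFormatAndName(List<RemoteFile> files) {
        Map<FormatAwareKey, RemoteFile> byKey = new HashMap<>();
        for (RemoteFile f : files) {
            byKey.put(new FormatAwareKey(f.dataFormat(), f.fileName()), f);
        }
        return byKey;
    }

    public static void main(String[] args) {
        List<RemoteFile> files = List.of(
            new RemoteFile("parquet", "segment_0", 10L),
            new RemoteFile("arrow", "segment_0", 20L));

        System.out.println(indexByName(files).size());          // prints 1
        System.out.println(indexByFormatAndName(files).size()); // prints 2
    }
}
```

Records supply `equals` and `hashCode` over their components, which is what makes `FormatAwareKey` usable as a map key without extra code. The sketch requires Java 16 or later.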