-
Notifications
You must be signed in to change notification settings - Fork 2.4k
Remove child level directory on refresh for CompositeIndexWriter #20326
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
📝 WalkthroughWalkthroughRemoves child-level on-disk directories after CompositeIndexWriter refresh by unwrapping FSDirectory instances to obtain Paths, closing child directories, and deleting their on-disk directories; adds a test asserting no leftover child directories after refresh. Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Poem
Pre-merge checks and finishing touches❌ Failed checks (1 warning, 1 inconclusive)
✅ Passed checks (1 passed)
✨ Finishing touches
Comment |
c02676a to
d06276f
Compare
|
❌ Gradle check result for d06276f: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
🧹 Nitpick comments (1)
server/src/test/java/org/opensearch/index/engine/CompositeIndexWriterForAppendTests.java (1)
192-228: Consider improving test clarity and robustness.The test verifies that child directories are removed after refresh, but has some areas that could be strengthened:
Line 219: The filter logic with
path.endsWith("extra0") == falseand the- 1adjustment uses unexplained magic values. This makes the test fragile and harder to maintain.Line 219: Style -
attributes.isDirectory() == trueshould be simplified toattributes.isDirectory().The test doesn't verify that child directories were actually created before refresh—it only checks they're gone after. The test could pass vacuously if directories were never created.
🔎 Suggested improvements
public void testChildDirectoryDeletedPostRefresh() throws IOException, InterruptedException { final EngineConfig engineConfig = config(); Path dataPath = engineConfig.getStore().shardPath().resolveIndex(); CompositeIndexWriter compositeIndexWriter = null; try { compositeIndexWriter = new CompositeIndexWriter( engineConfig, createWriter(), newSoftDeletesPolicy(), softDeletesField, indexWriterFactory ); Engine.Index operation = indexForDoc(createParsedDoc("id", null, DEFAULT_CRITERIA)); try (Releasable ignore1 = compositeIndexWriter.acquireLock(operation.uid().bytes())) { compositeIndexWriter.addDocuments(operation.docs(), operation.uid()); } operation = indexForDoc(createParsedDoc("id2", null, "testingNewCriteria")); try (Releasable ignore1 = compositeIndexWriter.acquireLock(operation.uid().bytes())) { compositeIndexWriter.addDocuments(operation.docs(), operation.uid()); } + // Verify child directories were created + long childDirCountBefore = Files.list(dataPath) + .filter(path -> Files.isDirectory(path) && path.getFileName().toString().startsWith(CHILD_DIRECTORY_PREFIX)) + .count(); + assertTrue("Expected child directories to be created before refresh", childDirCountBefore > 0); + compositeIndexWriter.beforeRefresh(); compositeIndexWriter.afterRefresh(true); - long directoryCount = Files.find(dataPath, 1, (path, attributes) -> attributes.isDirectory() == true && path.endsWith("extra0") == false).count() - 1; - // Ensure no child directory is pending here. - assertEquals(0, directoryCount); + // Verify all child directories are removed after refresh + long childDirCountAfter = Files.list(dataPath) + .filter(path -> Files.isDirectory(path) && path.getFileName().toString().startsWith(CHILD_DIRECTORY_PREFIX)) + .count(); + assertEquals("Expected no child directories after refresh", 0, childDirCountAfter); } finally { if (compositeIndexWriter != null) { IOUtils.closeWhileHandlingException(compositeIndexWriter); } } }
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
CHANGELOG.mdserver/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.javaserver/src/test/java/org/opensearch/index/engine/CompositeIndexWriterForAppendTests.java
🧰 Additional context used
🧬 Code graph analysis (2)
server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java (1)
libs/common/src/main/java/org/opensearch/common/util/io/IOUtils.java (1)
IOUtils(58-317)
server/src/test/java/org/opensearch/index/engine/CompositeIndexWriterForAppendTests.java (1)
libs/common/src/main/java/org/opensearch/common/util/io/IOUtils.java (1)
IOUtils(58-317)
🔇 Additional comments (3)
CHANGELOG.md (1)
103-103: LGTM!The changelog entry correctly documents the fix for removing child-level directories on refresh.
server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java (1)
22-23: LGTM!The imports are necessary for the new directory cleanup functionality.
Also applies to: 44-44
server/src/test/java/org/opensearch/index/engine/CompositeIndexWriterForAppendTests.java (1)
32-33: LGTM!The imports are necessary for filesystem verification in the new test method.
| Path[] childDirectoryPath = directoryToCombine.stream().map(directory -> getLocalFSDirectory(directory).getDirectory()).toArray(Path[]::new); | ||
| IOUtils.closeWhileHandlingException(directoryToCombine); | ||
| IOUtils.rm(childDirectoryPath); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider more graceful error handling for directory removal failures.
The current flow merges child indexes into the parent, closes directories, then removes them from disk. If IOUtils.rm fails (e.g., due to filesystem permissions or disk issues), the catch block in beforeRefresh triggers a full rollback, causing shard failure even though the merge succeeded and directories are already closed.
This could lead to:
- Unnecessary shard failures from transient filesystem issues
- Orphaned closed directories accumulating on disk if removal consistently fails
Additionally, if getLocalFSDirectory throws during the stream operation on line 546, the directories in directoryToCombine remain open and won't be closed.
Consider:
- Wrapping the path extraction in a try-catch to ensure directories are closed even if path extraction fails
- Using
IOUtils.deleteFilesIgnoringExceptionsinstead ofIOUtils.rmfor removal, or catching and logging removal failures separately rather than triggering full rollback - Alternatively, if directory removal is critical for correctness, document why rollback is necessary when it fails
🔎 Potential safer approach
if (!directoryToCombine.isEmpty()) {
accumulatingIndexWriter.addIndexes(directoryToCombine.toArray(new Directory[0]));
- Path[] childDirectoryPath = directoryToCombine.stream().map(directory -> getLocalFSDirectory(directory).getDirectory()).toArray(Path[]::new);
- IOUtils.closeWhileHandlingException(directoryToCombine);
- IOUtils.rm(childDirectoryPath);
+ Path[] childDirectoryPath = null;
+ try {
+ childDirectoryPath = directoryToCombine.stream()
+ .map(directory -> getLocalFSDirectory(directory).getDirectory())
+ .toArray(Path[]::new);
+ } finally {
+ IOUtils.closeWhileHandlingException(directoryToCombine);
+ }
+
+ if (childDirectoryPath != null) {
+ try {
+ IOUtils.rm(childDirectoryPath);
+ } catch (IOException e) {
+ logger.warn("Failed to remove child directories after merge, may cause disk space accumulation", e);
+ // Consider whether to throw or just log
+ }
+ }
}Committable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
In server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java
around lines 546-548, the code maps directoryToCombine to Path[] then closes and
removes them; if getLocalFSDirectory throws the stream mapping will leave
directories open, and if IOUtils.rm fails it currently causes a full rollback.
Wrap the Path extraction in a try/catch/finally so that any exception during
mapping still runs IOUtils.closeWhileHandlingException on directoryToCombine,
and replace the unconditional IOUtils.rm call with either
IOUtils.deleteFilesIgnoringExceptions or a guarded try/catch that logs removal
failures (including exception details) without forcing a rollback; if removal
truly must trigger rollback, add a clear comment documenting that invariant and
rethrow only after logging.
| private FSDirectory getLocalFSDirectory(Directory localDirectory) { | ||
| FSDirectory localFSDirectory; | ||
| if (localDirectory instanceof FSDirectory) { | ||
| localFSDirectory = (FSDirectory) localDirectory; | ||
| } else { | ||
| // In this case it should be a FilterDirectory wrapped over FSDirectory as per above validation. | ||
| localFSDirectory = (FSDirectory) (((FilterDirectory) localDirectory).getDelegate()); | ||
| } | ||
|
|
||
| return localFSDirectory; | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add defensive checks to prevent ClassCastException.
The method assumes the directory is either FSDirectory or a single-level FilterDirectory wrapping an FSDirectory, but lacks validation. This could fail if:
- The directory is
null - The directory is neither
FSDirectorynorFilterDirectory(line 560 cast fails) FilterDirectorywraps something other thanFSDirectory(line 560 cast fails)- Multiple layers of
FilterDirectoryexist (only one level is unwrapped)
The comment mentions "as per above validation" but no such validation exists.
🔎 Proposed defensive implementation
private FSDirectory getLocalFSDirectory(Directory localDirectory) {
+ if (localDirectory == null) {
+ throw new IllegalArgumentException("localDirectory cannot be null");
+ }
+
FSDirectory localFSDirectory;
if (localDirectory instanceof FSDirectory) {
localFSDirectory = (FSDirectory) localDirectory;
- } else {
- // In this case it should be a FilterDirectory wrapped over FSDirectory as per above validation.
+ } else if (localDirectory instanceof FilterDirectory) {
+ // Unwrap FilterDirectory to get the delegate
+ Directory delegate = ((FilterDirectory) localDirectory).getDelegate();
+ // Recursively unwrap in case of multiple FilterDirectory layers
+ if (delegate instanceof FilterDirectory || delegate instanceof FSDirectory) {
+ localFSDirectory = getLocalFSDirectory(delegate);
+ } else {
+ throw new IllegalArgumentException(
+ "Expected FSDirectory but FilterDirectory wraps: " + delegate.getClass().getName()
+ );
+ }
+ } else {
+ throw new IllegalArgumentException(
+ "Expected FSDirectory or FilterDirectory but got: " + localDirectory.getClass().getName()
+ );
- localFSDirectory = (FSDirectory) (((FilterDirectory) localDirectory).getDelegate());
}
return localFSDirectory;
}🤖 Prompt for AI Agents
In server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java
around lines 554 to 564, the method unconditionally casts to FSDirectory and
assumes a single FilterDirectory wrapper; add defensive checks: validate
localDirectory is not null, then iteratively unwrap while localDirectory is an
instance of FilterDirectory to reach the delegate, and after unwrapping assert
the resulting object is an instance of FSDirectory; if it is not, throw an
IllegalArgumentException (or another appropriate runtime exception) with a clear
message including the actual class encountered so callers can diagnose the
mismatch.
8d4412f to
9fb528a
Compare
|
❌ Gradle check result for 9fb528a: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
Signed-off-by: RS146BIJAY <[email protected]>
9fb528a to
f871f27
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
♻️ Duplicate comments (2)
server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java (2)
546-550: Error handling needs improvement for directory cleanup.The current implementation has the issues identified in previous reviews:
- If
getLocalFSDirectorythrows during the stream mapping (line 546-547), the directories indirectoryToCombineremain open and won't be closed- If
IOUtils.rmfails (line 550), it triggers a full rollback via the catch block at line 529, causing shard failure even though the merge succeeded and directories are already closedThis can lead to:
- Orphaned closed directories accumulating on disk
- Unnecessary shard failures from transient filesystem issues (permissions, disk space)
Consider wrapping the path extraction in a try-finally block to ensure directories are always closed, and handle removal failures more gracefully (either using
IOUtils.deleteFilesIgnoringExceptionsor logging failures without triggering rollback).Based on previous review feedback that flagged these same concerns.
556-566: Add defensive checks to prevent ClassCastException.The method lacks validation and can fail with
ClassCastExceptionin several scenarios:
- No null check on
localDirectory- No validation before casting to
FilterDirectory(line 562)- No validation that the delegate is actually
FSDirectory(line 562)- Only unwraps one level of
FilterDirectory(multiple layers not handled)- Comment mentions "as per above validation" but no such validation exists
Consider adding:
- Null check at the start
- Validation that localDirectory is either FSDirectory or FilterDirectory, with a clear exception message if not
- Recursive unwrapping to handle multiple FilterDirectory layers
- Validation that the final unwrapped delegate is FSDirectory
Based on previous review feedback that identified these missing defensive checks.
🧹 Nitpick comments (1)
server/src/test/java/org/opensearch/index/engine/CompositeIndexWriterForAppendTests.java (1)
192-232: Consider strengthening the test verification.The test validates the core functionality, but could be more robust:
- Line 222: The filter excluding
"extra0"relies on hardcoded internal directory naming, which is brittle- Line 223: Subtracting 1 from the count assumes a specific directory structure that may change
- Missing verification: The test doesn't confirm child directories existed before refresh (only checks they're gone after)
- No document validation: Doesn't verify the documents were successfully indexed/merged
Consider:
- Counting child directories before and after refresh to confirm cleanup actually occurred
- Using a more robust filter (e.g., checking for the
CHILD_DIRECTORY_PREFIXconstant from line 56 of CompositeIndexWriter)- Verifying the documents are accessible after refresh via a DirectoryReader
🔎 Example improvement
+// Count child directories before refresh +long preRefreshCount = Files.find( + dataPath, + 1, + (path, attrs) -> attrs.isDirectory() && path.getFileName().toString().startsWith(BucketedCompositeDirectory.CHILD_DIRECTORY_PREFIX) +).count(); +assertTrue("Expected child directories before refresh", preRefreshCount > 0); + compositeIndexWriter.beforeRefresh(); compositeIndexWriter.afterRefresh(true); -long directoryCount = Files.find( +long postRefreshCount = Files.find( dataPath, 1, - (path, attributes) -> attributes.isDirectory() == true && path.endsWith("extra0") == false -).count() - 1; -// Ensure no child directory is pending here. -assertEquals(0, directoryCount); + (path, attrs) -> attrs.isDirectory() && path.getFileName().toString().startsWith(BucketedCompositeDirectory.CHILD_DIRECTORY_PREFIX) +).count(); +assertEquals("Child directories should be cleaned up after refresh", 0, postRefreshCount); + +// Verify documents are still accessible +try (DirectoryReader reader = DirectoryReader.open(compositeIndexWriter.getAccumulatingIndexWriter())) { + assertEquals("Both documents should be in parent writer", 2, reader.numDocs()); +}
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
CHANGELOG.mdserver/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.javaserver/src/test/java/org/opensearch/index/engine/CompositeIndexWriterForAppendTests.java
🚧 Files skipped from review as they are similar to previous changes (1)
- CHANGELOG.md
🧰 Additional context used
🧬 Code graph analysis (1)
server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java (1)
libs/common/src/main/java/org/opensearch/common/util/io/IOUtils.java (1)
IOUtils(58-317)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (20)
- GitHub Check: gradle-check
- GitHub Check: assemble (25, ubuntu-24.04-arm)
- GitHub Check: assemble (21, windows-latest)
- GitHub Check: assemble (21, ubuntu-24.04-arm)
- GitHub Check: assemble (21, ubuntu-latest)
- GitHub Check: assemble (25, ubuntu-latest)
- GitHub Check: assemble (25, windows-latest)
- GitHub Check: precommit (21, windows-2025, true)
- GitHub Check: precommit (25, ubuntu-24.04-arm)
- GitHub Check: precommit (25, macos-15)
- GitHub Check: precommit (21, macos-15)
- GitHub Check: precommit (25, windows-latest)
- GitHub Check: precommit (21, ubuntu-24.04-arm)
- GitHub Check: precommit (25, macos-15-intel)
- GitHub Check: precommit (21, macos-15-intel)
- GitHub Check: precommit (25, ubuntu-latest)
- GitHub Check: precommit (21, windows-latest)
- GitHub Check: precommit (21, ubuntu-latest)
- GitHub Check: detect-breaking-change
- GitHub Check: Analyze (java)
🔇 Additional comments (3)
server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java (1)
22-23: LGTM!The new imports are necessary for the directory cleanup functionality introduced in this PR.
Also applies to: 44-44
server/src/test/java/org/opensearch/index/engine/CompositeIndexWriterForAppendTests.java (2)
32-33: LGTM!The imports are required for filesystem verification in the new test.
234-275: LGTM!The concurrency test appropriately validates that the refresh mechanism (including the new directory cleanup logic) handles concurrent indexing operations without deadlocks or crashes. The test structure is sound with proper thread coordination and cleanup.
|
❌ Gradle check result for f871f27: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
| if (localDirectory instanceof FSDirectory) { | ||
| localFSDirectory = (FSDirectory) localDirectory; | ||
| } else { | ||
| // In this case it should be a FilterDirectory wrapped over FSDirectory as per above validation. | ||
| localFSDirectory = (FSDirectory) (((FilterDirectory) localDirectory).getDelegate()); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should assert that it is always an FSDirectory since we use createFSDirectory method from store
| long directoryCount = Files.find( | ||
| dataPath, | ||
| 1, | ||
| (path, attributes) -> attributes.isDirectory() == true && path.endsWith("extra0") == false |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[nit] Add a comment explaining why extra0 needs to be removed [ExtrasFS]
Description
Remove child level directory during refresh once child level IndexWriter is synced with parent IndexWriter.
Related Issues
#19921
Summary by CodeRabbit
Bug Fixes
Tests
✏️ Tip: You can customize this high-level summary in your review settings.