Skip to content

Conversation

@RS146BIJAY
Copy link
Contributor

@RS146BIJAY RS146BIJAY commented Dec 28, 2025

Description

Remove child level directory during refresh once child level IndexWriter is synced with parent IndexWriter.

Related Issues

#19921

Summary by CodeRabbit

  • Bug Fixes

    • Child index directories are now removed during refresh, preventing accumulation of orphaned on-disk directories.
  • Tests

    • Added/updated tests to verify child directories are deleted after refresh and to validate concurrent indexing during refresh cycles.

✏️ Tip: You can customize this high-level summary in your review settings.

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Dec 28, 2025

📝 Walkthrough

Walkthrough

Removes child-level on-disk directories after CompositeIndexWriter refresh by unwrapping FSDirectory instances to obtain Paths, closing child directories, and deleting their on-disk directories; adds a test asserting no leftover child directories after refresh.

Changes

Cohort / File(s) Summary
Changelog
CHANGELOG.md
Added "Fixed" entry noting removal of child-level directories on refresh for CompositeIndexWriter.
Index writer implementation
server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java
Added imports (FSDirectory, FilterDirectory, java.nio.file.Path); after collecting child Directory instances, added getLocalFSDirectory(Directory) to unwrap FSDirectory (handles FilterDirectory), derive Path[], close child directories, and remove their on-disk directories via IOUtils.rm(Path[]).
Tests
server/src/test/java/org/opensearch/index/engine/CompositeIndexWriterForAppendTests.java
Added testChildDirectoryDeletedPostRefresh() which creates docs, calls beforeRefresh()/afterRefresh(true), and asserts no child directories remain using java.nio.file.Files/Path; adjusted placement/structure of testConcurrentIndexingDuringRefresh() and added NIO imports.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Poem

🐇 I hopped through shard paths, soft paws in the night,
I closed every writer and swept folders light.
Merged children gone, not a nest left in view,
The index dreams tidy — fresh, calm, and new. ✨

Pre-merge checks and finishing touches

❌ Failed checks (1 warning, 1 inconclusive)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
Description check ❓ Inconclusive The description provides the core change explanation and links to the related issue, but lacks information about testing and omits the required checklist items. Complete the checklist section and confirm whether functionality includes testing and if any API documentation updates are needed.
✅ Passed checks (1 passed)
Check name Status Explanation
Title check ✅ Passed The title accurately and concisely summarizes the main change: removing child-level directories on refresh for CompositeIndexWriter, matching the core objective.
✨ Finishing touches
  • 📝 Generate docstrings

Comment @coderabbitai help to get the list of available commands and usage tips.

@github-actions
Copy link
Contributor

❌ Gradle check result for d06276f: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (1)
server/src/test/java/org/opensearch/index/engine/CompositeIndexWriterForAppendTests.java (1)

192-228: Consider improving test clarity and robustness.

The test verifies that child directories are removed after refresh, but has some areas that could be strengthened:

  1. Line 219: The filter logic with path.endsWith("extra0") == false and the - 1 adjustment uses unexplained magic values. This makes the test fragile and harder to maintain.

  2. Line 219: Style - attributes.isDirectory() == true should be simplified to attributes.isDirectory().

  3. The test doesn't verify that child directories were actually created before refresh—it only checks they're gone after. The test could pass vacuously if directories were never created.

🔎 Suggested improvements
 public void testChildDirectoryDeletedPostRefresh() throws IOException, InterruptedException {
     final EngineConfig engineConfig = config();
     Path dataPath = engineConfig.getStore().shardPath().resolveIndex();
     CompositeIndexWriter compositeIndexWriter = null;
     
     try {
         compositeIndexWriter = new CompositeIndexWriter(
             engineConfig,
             createWriter(),
             newSoftDeletesPolicy(),
             softDeletesField,
             indexWriterFactory
         );
         
         Engine.Index operation = indexForDoc(createParsedDoc("id", null, DEFAULT_CRITERIA));
         try (Releasable ignore1 = compositeIndexWriter.acquireLock(operation.uid().bytes())) {
             compositeIndexWriter.addDocuments(operation.docs(), operation.uid());
         }
         
         operation = indexForDoc(createParsedDoc("id2", null, "testingNewCriteria"));
         try (Releasable ignore1 = compositeIndexWriter.acquireLock(operation.uid().bytes())) {
             compositeIndexWriter.addDocuments(operation.docs(), operation.uid());
         }
         
+        // Verify child directories were created
+        long childDirCountBefore = Files.list(dataPath)
+            .filter(path -> Files.isDirectory(path) && path.getFileName().toString().startsWith(CHILD_DIRECTORY_PREFIX))
+            .count();
+        assertTrue("Expected child directories to be created before refresh", childDirCountBefore > 0);
+        
         compositeIndexWriter.beforeRefresh();
         compositeIndexWriter.afterRefresh(true);
         
-        long directoryCount = Files.find(dataPath, 1, (path, attributes) -> attributes.isDirectory() == true && path.endsWith("extra0") == false).count() - 1;
-        // Ensure no child directory is pending here.
-        assertEquals(0, directoryCount);
+        // Verify all child directories are removed after refresh
+        long childDirCountAfter = Files.list(dataPath)
+            .filter(path -> Files.isDirectory(path) && path.getFileName().toString().startsWith(CHILD_DIRECTORY_PREFIX))
+            .count();
+        assertEquals("Expected no child directories after refresh", 0, childDirCountAfter);
     }  finally {
         if (compositeIndexWriter != null) {
             IOUtils.closeWhileHandlingException(compositeIndexWriter);
         }
     }
 }
📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 696951d and d06276f.

📒 Files selected for processing (3)
  • CHANGELOG.md
  • server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java
  • server/src/test/java/org/opensearch/index/engine/CompositeIndexWriterForAppendTests.java
🧰 Additional context used
🧬 Code graph analysis (2)
server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java (1)
libs/common/src/main/java/org/opensearch/common/util/io/IOUtils.java (1)
  • IOUtils (58-317)
server/src/test/java/org/opensearch/index/engine/CompositeIndexWriterForAppendTests.java (1)
libs/common/src/main/java/org/opensearch/common/util/io/IOUtils.java (1)
  • IOUtils (58-317)
🔇 Additional comments (3)
CHANGELOG.md (1)

103-103: LGTM!

The changelog entry correctly documents the fix for removing child-level directories on refresh.

server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java (1)

22-23: LGTM!

The imports are necessary for the new directory cleanup functionality.

Also applies to: 44-44

server/src/test/java/org/opensearch/index/engine/CompositeIndexWriterForAppendTests.java (1)

32-33: LGTM!

The imports are necessary for filesystem verification in the new test method.

Comment on lines 546 to 548
Path[] childDirectoryPath = directoryToCombine.stream().map(directory -> getLocalFSDirectory(directory).getDirectory()).toArray(Path[]::new);
IOUtils.closeWhileHandlingException(directoryToCombine);
IOUtils.rm(childDirectoryPath);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Consider more graceful error handling for directory removal failures.

The current flow merges child indexes into the parent, closes directories, then removes them from disk. If IOUtils.rm fails (e.g., due to filesystem permissions or disk issues), the catch block in beforeRefresh triggers a full rollback, causing shard failure even though the merge succeeded and directories are already closed.

This could lead to:

  • Unnecessary shard failures from transient filesystem issues
  • Orphaned closed directories accumulating on disk if removal consistently fails

Additionally, if getLocalFSDirectory throws during the stream operation on line 546, the directories in directoryToCombine remain open and won't be closed.

Consider:

  1. Wrapping the path extraction in a try-catch to ensure directories are closed even if path extraction fails
  2. Using IOUtils.deleteFilesIgnoringExceptions instead of IOUtils.rm for removal, or catching and logging removal failures separately rather than triggering full rollback
  3. Alternatively, if directory removal is critical for correctness, document why rollback is necessary when it fails
🔎 Potential safer approach
 if (!directoryToCombine.isEmpty()) {
     accumulatingIndexWriter.addIndexes(directoryToCombine.toArray(new Directory[0]));
-    Path[] childDirectoryPath = directoryToCombine.stream().map(directory -> getLocalFSDirectory(directory).getDirectory()).toArray(Path[]::new);
-    IOUtils.closeWhileHandlingException(directoryToCombine);
-    IOUtils.rm(childDirectoryPath);
+    Path[] childDirectoryPath = null;
+    try {
+        childDirectoryPath = directoryToCombine.stream()
+            .map(directory -> getLocalFSDirectory(directory).getDirectory())
+            .toArray(Path[]::new);
+    } finally {
+        IOUtils.closeWhileHandlingException(directoryToCombine);
+    }
+    
+    if (childDirectoryPath != null) {
+        try {
+            IOUtils.rm(childDirectoryPath);
+        } catch (IOException e) {
+            logger.warn("Failed to remove child directories after merge, may cause disk space accumulation", e);
+            // Consider whether to throw or just log
+        }
+    }
 }

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents
In server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java
around lines 546-548, the code maps directoryToCombine to Path[] then closes and
removes them; if getLocalFSDirectory throws the stream mapping will leave
directories open, and if IOUtils.rm fails it currently causes a full rollback.
Wrap the Path extraction in a try/catch/finally so that any exception during
mapping still runs IOUtils.closeWhileHandlingException on directoryToCombine,
and replace the unconditional IOUtils.rm call with either
IOUtils.deleteFilesIgnoringExceptions or a guarded try/catch that logs removal
failures (including exception details) without forcing a rollback; if removal
truly must trigger rollback, add a clear comment documenting that invariant and
rethrow only after logging.

Comment on lines +554 to +566
private FSDirectory getLocalFSDirectory(Directory localDirectory) {
FSDirectory localFSDirectory;
if (localDirectory instanceof FSDirectory) {
localFSDirectory = (FSDirectory) localDirectory;
} else {
// In this case it should be a FilterDirectory wrapped over FSDirectory as per above validation.
localFSDirectory = (FSDirectory) (((FilterDirectory) localDirectory).getDelegate());
}

return localFSDirectory;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Add defensive checks to prevent ClassCastException.

The method assumes the directory is either FSDirectory or a single-level FilterDirectory wrapping an FSDirectory, but lacks validation. This could fail if:

  • The directory is null
  • The directory is neither FSDirectory nor FilterDirectory (line 560 cast fails)
  • FilterDirectory wraps something other than FSDirectory (line 560 cast fails)
  • Multiple layers of FilterDirectory exist (only one level is unwrapped)

The comment mentions "as per above validation" but no such validation exists.

🔎 Proposed defensive implementation
 private FSDirectory getLocalFSDirectory(Directory localDirectory) {
+    if (localDirectory == null) {
+        throw new IllegalArgumentException("localDirectory cannot be null");
+    }
+    
     FSDirectory localFSDirectory;
     if (localDirectory instanceof FSDirectory) {
         localFSDirectory = (FSDirectory) localDirectory;
-    } else {
-        // In this case it should be a FilterDirectory wrapped over FSDirectory as per above validation.
+    } else if (localDirectory instanceof FilterDirectory) {
+        // Unwrap FilterDirectory to get the delegate
+        Directory delegate = ((FilterDirectory) localDirectory).getDelegate();
+        // Recursively unwrap in case of multiple FilterDirectory layers
+        if (delegate instanceof FilterDirectory || delegate instanceof FSDirectory) {
+            localFSDirectory = getLocalFSDirectory(delegate);
+        } else {
+            throw new IllegalArgumentException(
+                "Expected FSDirectory but FilterDirectory wraps: " + delegate.getClass().getName()
+            );
+        }
+    } else {
+        throw new IllegalArgumentException(
+            "Expected FSDirectory or FilterDirectory but got: " + localDirectory.getClass().getName()
+        );
-        localFSDirectory = (FSDirectory) (((FilterDirectory) localDirectory).getDelegate());
     }
     
     return localFSDirectory;
 }
🤖 Prompt for AI Agents
In server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java
around lines 554 to 564, the method unconditionally casts to FSDirectory and
assumes a single FilterDirectory wrapper; add defensive checks: validate
localDirectory is not null, then iteratively unwrap while localDirectory is an
instance of FilterDirectory to reach the delegate, and after unwrapping assert
the resulting object is an instance of FSDirectory; if it is not, throw an
IllegalArgumentException (or another appropriate runtime exception) with a clear
message including the actual class encountered so callers can diagnose the
mismatch.

@RS146BIJAY RS146BIJAY force-pushed the directory-cleanup branch 2 times, most recently from 8d4412f to 9fb528a Compare December 30, 2025 14:15
@github-actions
Copy link
Contributor

❌ Gradle check result for 9fb528a: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

♻️ Duplicate comments (2)
server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java (2)

546-550: Error handling needs improvement for directory cleanup.

The current implementation has the issues identified in previous reviews:

  1. If getLocalFSDirectory throws during the stream mapping (line 546-547), the directories in directoryToCombine remain open and won't be closed
  2. If IOUtils.rm fails (line 550), it triggers a full rollback via the catch block at line 529, causing shard failure even though the merge succeeded and directories are already closed

This can lead to:

  • Orphaned closed directories accumulating on disk
  • Unnecessary shard failures from transient filesystem issues (permissions, disk space)

Consider wrapping the path extraction in a try-finally block to ensure directories are always closed, and handle removal failures more gracefully (either using IOUtils.deleteFilesIgnoringExceptions or logging failures without triggering rollback).

Based on previous review feedback that flagged these same concerns.


556-566: Add defensive checks to prevent ClassCastException.

The method lacks validation and can fail with ClassCastException in several scenarios:

  1. No null check on localDirectory
  2. No validation before casting to FilterDirectory (line 562)
  3. No validation that the delegate is actually FSDirectory (line 562)
  4. Only unwraps one level of FilterDirectory (multiple layers not handled)
  5. Comment mentions "as per above validation" but no such validation exists

Consider adding:

  • Null check at the start
  • Validation that localDirectory is either FSDirectory or FilterDirectory, with a clear exception message if not
  • Recursive unwrapping to handle multiple FilterDirectory layers
  • Validation that the final unwrapped delegate is FSDirectory

Based on previous review feedback that identified these missing defensive checks.

🧹 Nitpick comments (1)
server/src/test/java/org/opensearch/index/engine/CompositeIndexWriterForAppendTests.java (1)

192-232: Consider strengthening the test verification.

The test validates the core functionality, but could be more robust:

  1. Line 222: The filter excluding "extra0" relies on hardcoded internal directory naming, which is brittle
  2. Line 223: Subtracting 1 from the count assumes a specific directory structure that may change
  3. Missing verification: The test doesn't confirm child directories existed before refresh (only checks they're gone after)
  4. No document validation: Doesn't verify the documents were successfully indexed/merged

Consider:

  • Counting child directories before and after refresh to confirm cleanup actually occurred
  • Using a more robust filter (e.g., checking for the CHILD_DIRECTORY_PREFIX constant from line 56 of CompositeIndexWriter)
  • Verifying the documents are accessible after refresh via a DirectoryReader
🔎 Example improvement
+// Count child directories before refresh
+long preRefreshCount = Files.find(
+    dataPath,
+    1, 
+    (path, attrs) -> attrs.isDirectory() && path.getFileName().toString().startsWith(BucketedCompositeDirectory.CHILD_DIRECTORY_PREFIX)
+).count();
+assertTrue("Expected child directories before refresh", preRefreshCount > 0);
+
 compositeIndexWriter.beforeRefresh();
 compositeIndexWriter.afterRefresh(true);
 
-long directoryCount = Files.find(
+long postRefreshCount = Files.find(
     dataPath,
     1,
-    (path, attributes) -> attributes.isDirectory() == true && path.endsWith("extra0") == false
-).count() - 1;
-// Ensure no child directory is pending here.
-assertEquals(0, directoryCount);
+    (path, attrs) -> attrs.isDirectory() && path.getFileName().toString().startsWith(BucketedCompositeDirectory.CHILD_DIRECTORY_PREFIX)
+).count();
+assertEquals("Child directories should be cleaned up after refresh", 0, postRefreshCount);
+
+// Verify documents are still accessible
+try (DirectoryReader reader = DirectoryReader.open(compositeIndexWriter.getAccumulatingIndexWriter())) {
+    assertEquals("Both documents should be in parent writer", 2, reader.numDocs());
+}
📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 9fb528a and f871f27.

📒 Files selected for processing (3)
  • CHANGELOG.md
  • server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java
  • server/src/test/java/org/opensearch/index/engine/CompositeIndexWriterForAppendTests.java
🚧 Files skipped from review as they are similar to previous changes (1)
  • CHANGELOG.md
🧰 Additional context used
🧬 Code graph analysis (1)
server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java (1)
libs/common/src/main/java/org/opensearch/common/util/io/IOUtils.java (1)
  • IOUtils (58-317)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (20)
  • GitHub Check: gradle-check
  • GitHub Check: assemble (25, ubuntu-24.04-arm)
  • GitHub Check: assemble (21, windows-latest)
  • GitHub Check: assemble (21, ubuntu-24.04-arm)
  • GitHub Check: assemble (21, ubuntu-latest)
  • GitHub Check: assemble (25, ubuntu-latest)
  • GitHub Check: assemble (25, windows-latest)
  • GitHub Check: precommit (21, windows-2025, true)
  • GitHub Check: precommit (25, ubuntu-24.04-arm)
  • GitHub Check: precommit (25, macos-15)
  • GitHub Check: precommit (21, macos-15)
  • GitHub Check: precommit (25, windows-latest)
  • GitHub Check: precommit (21, ubuntu-24.04-arm)
  • GitHub Check: precommit (25, macos-15-intel)
  • GitHub Check: precommit (21, macos-15-intel)
  • GitHub Check: precommit (25, ubuntu-latest)
  • GitHub Check: precommit (21, windows-latest)
  • GitHub Check: precommit (21, ubuntu-latest)
  • GitHub Check: detect-breaking-change
  • GitHub Check: Analyze (java)
🔇 Additional comments (3)
server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java (1)

22-23: LGTM!

The new imports are necessary for the directory cleanup functionality introduced in this PR.

Also applies to: 44-44

server/src/test/java/org/opensearch/index/engine/CompositeIndexWriterForAppendTests.java (2)

32-33: LGTM!

The imports are required for filesystem verification in the new test.


234-275: LGTM!

The concurrency test appropriately validates that the refresh mechanism (including the new directory cleanup logic) handles concurrent indexing operations without deadlocks or crashes. The test structure is sound with proper thread coordination and cleanup.

@github-actions
Copy link
Contributor

❌ Gradle check result for f871f27: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Comment on lines +558 to +563
if (localDirectory instanceof FSDirectory) {
localFSDirectory = (FSDirectory) localDirectory;
} else {
// In this case it should be a FilterDirectory wrapped over FSDirectory as per above validation.
localFSDirectory = (FSDirectory) (((FilterDirectory) localDirectory).getDelegate());
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should assert that it is always an FSDirectory since we use createFSDirectory method from store

long directoryCount = Files.find(
dataPath,
1,
(path, attributes) -> attributes.isDirectory() == true && path.endsWith("extra0") == false
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[nit] Add a comment explaining why extra0 needs to be removed [ExtrasFS]

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants