@GunaPalanivel
Contributor

@GunaPalanivel GunaPalanivel commented Dec 18, 2025

Related Issues

Part of #2605

Description

This PR exposes the refresh parameter to relevant methods in OpenSearchDocumentStore, allowing users to control when index changes become visible to search operations.

Motivation

OpenSearch is a "near real-time" search engine where changes are not immediately visible after write/update/delete operations. Previously, users had to use time.sleep() workarounds to ensure consistency. This change exposes the underlying refresh parameter, giving users explicit control over this behavior.

Changes

New RefreshType parameter added to the following methods:

| Method | Default Value |
| --- | --- |
| write_documents / write_documents_async | "wait_for" |
| delete_documents / delete_documents_async | "wait_for" |
| delete_all_documents / delete_all_documents_async | True |
| delete_by_filter / delete_by_filter_async | "wait_for" |
| update_by_filter / update_by_filter_async | "wait_for" |

Parameter values:

  • True: Force refresh immediately after the operation
  • False: Do not refresh (better performance for bulk operations)
  • "wait_for": Block until the next refresh makes the changes visible to search (default, ensures read-your-writes consistency)
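
To make the three values concrete, here is a minimal toy model of a near-real-time index. It is only a sketch: `ToyIndex`, its `write` method, and the `forced_refreshes` counter are hypothetical names invented for illustration, not part of OpenSearchDocumentStore or the OpenSearch client. A real cluster refreshes periodically in the background; the toy model simulates that by refreshing inside the call when "wait_for" is used.

```python
from typing import Dict, List, Literal, Union

# Same shape as the parameter described above: True, False, or "wait_for".
RefreshType = Union[bool, Literal["wait_for"]]


class ToyIndex:
    """In-memory stand-in for a near-real-time index (hypothetical, illustration only)."""

    def __init__(self) -> None:
        self._searchable: Dict[str, str] = {}  # documents that search can see
        self._pending: Dict[str, str] = {}     # written but not yet refreshed
        self.forced_refreshes = 0              # refreshes explicitly forced by a caller

    def _refresh(self) -> None:
        # A refresh makes every buffered write visible to search.
        self._searchable.update(self._pending)
        self._pending.clear()

    def write(self, doc_id: str, text: str, refresh: RefreshType = "wait_for") -> None:
        self._pending[doc_id] = text
        if refresh is True:
            # Force a refresh right now (extra work on every call).
            self.forced_refreshes += 1
            self._refresh()
        elif refresh == "wait_for":
            # A real cluster would block here until its periodic refresh runs;
            # the toy model simulates that refresh before returning.
            self._refresh()
        # refresh=False: return immediately; the write stays invisible for now.

    def search_ids(self) -> List[str]:
        return sorted(self._searchable)


index = ToyIndex()
index.write("a", "first", refresh=False)
print(index.search_ids())  # [] because nothing has been refreshed yet
index.write("b", "second", refresh="wait_for")
print(index.search_ids())  # ['a', 'b'] because a refresh exposes all pending writes
```

The trade-off the sketch hints at: `True` pays for a refresh on every call, `False` is cheapest but gives no visibility guarantee, and "wait_for" trades latency for read-your-writes consistency without forcing extra refreshes.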

Test Updates

Updated integration tests to use refresh=True instead of time.sleep(), making tests more reliable and faster.

Additional Fixes

  • Fixed delete_all_documents_async to set wait_for_completion=True, ensuring the async delete operation completes before returning

How was this tested?

  • Existing integration tests updated to use the new refresh parameter
  • All tests pass with the new implementation

Add configurable refresh parameter to write, delete, and update methods
in OpenSearchDocumentStore.

This allows users to control when index changes become visible to search
operations, enabling read-your-writes consistency without relying on
time.sleep() workarounds.

Methods updated:
- write_documents / write_documents_async
- delete_documents / delete_documents_async
- delete_all_documents / delete_all_documents_async
- delete_by_filter / delete_by_filter_async
- update_by_filter / update_by_filter_async

The refresh parameter accepts:
- True: Force immediate refresh
- False: No refresh (best for bulk performance)
- 'wait_for': Wait for next refresh cycle (default)

Additional fix:
- Fixed delete_all_documents_async to set wait_for_completion=True,
  ensuring the async delete operation completes before returning

Closes deepset-ai#2065
@GunaPalanivel GunaPalanivel requested a review from a team as a code owner December 18, 2025 16:33
@GunaPalanivel GunaPalanivel requested review from vblagoje and removed request for a team December 18, 2025 16:33
@github-actions github-actions bot added integration:opensearch type:documentation Improvements or additions to documentation labels Dec 18, 2025
@anakin87
Member

Thanks for opening a separate PR. Let's put this on hold and first agree on #2622

@anakin87
Member

@GunaPalanivel when you have time, feel free to adapt this PR based on #2622 🙂

@GunaPalanivel
Contributor Author

@anakin87

I'm working on it 🙂

@GunaPalanivel
Contributor Author

GunaPalanivel commented Dec 19, 2025

@anakin87 Based on #2622, I've adapted this PR for OpenSearch:

Main changes:

  • Split refresh parameter types: Literal[True, False, "wait_for"] for write/delete methods, bool for delete_by_query/update_by_query operations (since they don't support "wait_for")
  • Set refresh="wait_for" as default for write_documents and delete_documents
  • Kept refresh=True for delete_all_documents and filter-based operations
  • Added proper OpenSearch API doc links to all refresh parameters
  • Updated tests to use refresh=True instead of time.sleep() hacks
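
The type split described in the first bullet could look roughly like the sketch below. The alias names (`DocumentRefresh`, `ByQueryRefresh`) and the two checking helpers are assumptions made up for illustration; the comment above only specifies the type shapes, not how the merged code names or validates them.

```python
from typing import Literal

# Hypothetical alias names; only the type shapes come from the comment above.
DocumentRefresh = Literal[True, False, "wait_for"]  # write_documents, delete_documents
ByQueryRefresh = bool                                # delete_by_filter, update_by_filter


def check_document_refresh(refresh: object) -> object:
    """Accept True, False, or "wait_for" for per-document operations."""
    # isinstance is used instead of a membership test so that 1 or 0 are not
    # silently treated as True or False.
    if isinstance(refresh, bool) or refresh == "wait_for":
        return refresh
    raise ValueError(f"refresh must be True, False, or 'wait_for', got {refresh!r}")


def check_by_query_refresh(refresh: object) -> bool:
    """Accept only a boolean for by-query operations, which do not support "wait_for"."""
    if not isinstance(refresh, bool):
        raise ValueError(f"refresh must be True or False for by-query operations, got {refresh!r}")
    return refresh


print(check_document_refresh("wait_for"))
print(check_by_query_refresh(True))
```

A static type checker would already flag `refresh="wait_for"` on a parameter annotated as `bool`; the runtime checks are just one way to surface the same constraint to callers who do not run one.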

Let me know if I missed anything or got something wrong, happy to fix! 🙂

@GunaPalanivel
Contributor Author

Hi @anakin87, just a gentle follow-up on this PR when you get a moment.

All checks are passing, and I’m happy to make any changes if needed.

A review would really help me move forward and pick up the next issues.

@vblagoje
Member

vblagoje commented Jan 5, 2026

@GunaPalanivel please coordinate with @davidsbatista, as he's working on OpenSearchDocumentStore; the merge conflict is most likely due to that

@davidsbatista
Contributor

@GunaPalanivel, thanks for the contribution! Please sync with the latest main and fix the conflicts; they should be easy. In any case, let me know if you need help.

@anakin87 anakin87 self-requested a review January 5, 2026 14:36
@anakin87
Member

anakin87 commented Jan 5, 2026

I'll review it soon (once conflicts are fixed).

- Sync calls: always wait_for_completion=True (to get result dict with deleted count)
- Async calls: wait_for_completion=True only if refresh=True
- Logic: wait_for_completion = (not is_async) or refresh
@davidsbatista davidsbatista changed the title feat: expose refresh parameter in OpenSearchDocumentStore feat: expose refresh parameter in OpenSearchDocumentStore Jan 6, 2026
Align with Elasticsearch behavior as suggested by @davidsbatista:
- Set wait_for_completion = not is_async (True only for sync calls)
- This ensures sync calls get the result dict for logging deleted count
- Async calls do not wait, improving performance
- Centralizes simple logic in _prepare_delete_all_request

Both sync and async delete_all_documents need to wait for completion
so documents are actually deleted before the function returns.

The is_async parameter is no longer needed since wait_for_completion
is always True for delete_all_documents operations.
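
The commit messages above describe three successive rules for wait_for_completion in delete_all_documents. The sketch below restates them side by side so the differences are easy to see. The function names are hypothetical; only the boolean expressions are taken from the commit messages, and the actual code lives in _prepare_delete_all_request, which is not reproduced here.

```python
from itertools import product


def wait_for_completion_v1(is_async: bool, refresh: bool) -> bool:
    # First rule: sync calls always wait; async calls wait only if refresh=True.
    return (not is_async) or refresh


def wait_for_completion_v2(is_async: bool) -> bool:
    # Second rule: wait only for sync calls, so they can log the deleted count.
    return not is_async


def wait_for_completion_final() -> bool:
    # Final rule: always wait, so documents are deleted before the call returns.
    return True


# Print a small truth table comparing the three rules.
print("is_async  refresh  v1     v2     final")
for is_async, refresh in product([False, True], repeat=2):
    print(
        f"{is_async!s:<9} {refresh!s:<8} "
        f"{wait_for_completion_v1(is_async, refresh)!s:<6} "
        f"{wait_for_completion_v2(is_async)!s:<6} "
        f"{wait_for_completion_final()!s}"
    )
```

The only rows where the rules disagree are the async ones: under the first two rules an async call could return before the deletion had finished, which is the case the final rule rules out.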
Member

@anakin87 anakin87 left a comment

Looks OK to me, aligned with Elasticsearch.

@davidsbatista could you take a final look?

Contributor

@davidsbatista davidsbatista left a comment

LGTM! Thanks again @GunaPalanivel for one more contribution! We appreciate it!

@anakin87
Member

anakin87 commented Jan 7, 2026

@davidsbatista in case you merge and release a new version, let's use 5.0.0 since this is technically a breaking change.

@anakin87 anakin87 changed the title feat: expose refresh parameter in OpenSearchDocumentStore feat!: expose refresh parameter in OpenSearchDocumentStore Jan 7, 2026
@davidsbatista davidsbatista merged commit c36fbfa into deepset-ai:main Jan 7, 2026
6 checks passed
