Skip search shards with INDEX_REFRESH_BLOCK #129132

Open. benchaplin wants to merge 15 commits into main from skip_search_shards_with_index_block.

Conversation

benchaplin
Contributor

#117543 introduced a ClusterBlock (INDEX_REFRESH_BLOCK) that is applied to new Serverless indices whose search shards are not yet up. We should skip searches against indices with this block to avoid meaningless 503s.

@benchaplin benchaplin requested a review from tlrx June 9, 2025 05:56
@benchaplin benchaplin added the >non-issue, Team:Search Foundations, :Search Foundations/Search, and v9.1.0 labels Jun 9, 2025
@elasticsearchmachine elasticsearchmachine added the serverless-linked label Jun 9, 2025
@elasticsearchmachine
Collaborator

Pinging @elastic/es-search-foundations (Team:Search Foundations)

@benchaplin benchaplin requested a review from a team as a code owner June 9, 2025 14:39
@benchaplin benchaplin force-pushed the skip_search_shards_with_index_block branch from 998cdd5 to 1ecc447 Compare June 9, 2025 17:05
@benchaplin benchaplin removed the request for review from a team June 9, 2025 17:06
@benchaplin benchaplin requested review from a team as code owners June 11, 2025 21:03
@benchaplin benchaplin force-pushed the skip_search_shards_with_index_block branch from e233cc7 to 17706e2 Compare June 11, 2025 21:04
Member

@tlrx tlrx left a comment

Left some comments. The search part should be reviewed by the ES Search team.

@@ -283,7 +283,8 @@ public enum APIBlock implements Writeable {
     READ("read", INDEX_READ_BLOCK, Property.ServerlessPublic),
     WRITE("write", INDEX_WRITE_BLOCK, Property.ServerlessPublic),
     METADATA("metadata", INDEX_METADATA_BLOCK, Property.ServerlessPublic),
-    READ_ONLY_ALLOW_DELETE("read_only_allow_delete", INDEX_READ_ONLY_ALLOW_DELETE_BLOCK);
+    READ_ONLY_ALLOW_DELETE("read_only_allow_delete", INDEX_READ_ONLY_ALLOW_DELETE_BLOCK),
+    REFRESH("refresh", INDEX_REFRESH_BLOCK);
Member

I don't think we should allow blocking refreshes on indices after they have been created. Can we revert this please?

Comment on lines +34 to +35
var addIndexBlockRequest = new AddIndexBlockRequest(IndexMetadata.APIBlock.REFRESH, "test");
client().execute(TransportAddIndexBlockAction.TYPE, addIndexBlockRequest).actionGet();
Member

The refresh block should be added automatically to newly created indices as long as they have replicas and the "use refresh block" setting is enabled in the node settings. We should remove the ability to add the refresh block through the Add Index Block API.
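
The condition described in this comment can be written as a small predicate. This is only a sketch of the rule as stated here: the class name, method name, and parameters are hypothetical and are not taken from the Elasticsearch code base.

```java
public class RefreshBlockPolicySketch {

    /**
     * Hypothetical predicate for the rule in the comment above: a newly
     * created index gets the refresh block only when the "use refresh block"
     * node setting is enabled and the index has at least one replica.
     */
    static boolean shouldAddRefreshBlock(boolean useRefreshBlockEnabled, int numberOfReplicas) {
        return useRefreshBlockEnabled && numberOfReplicas > 0;
    }

    public static void main(String[] args) {
        System.out.println(shouldAddRefreshBlock(true, 1));
        System.out.println(shouldAddRefreshBlock(true, 0));
        System.out.println(shouldAddRefreshBlock(false, 1));
    }
}
```

Under this reading the block is decided once, at index creation, which is why exposing it through the Add Index Block API would be unnecessary.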

assertHitCount(prepareSearch().setQuery(QueryBuilders.matchAllQuery()), 0);
}

public void testSearchMultipleIndicesEachWithAnIndexRefreshBlock() {
Member

I think this could be folded into a single test: randomly create one or more indices, most with replicas but some without, then allocate zero or more search shards and check the expected results, and finally assign all search shards and check the results again.

Member

@cbuescher cbuescher left a comment

I did a first pass on the search-related side of things and left a few questions and comments.

shardId,
shardRouting.getShardRoutings(),
finalIndices,
finalIndices == SKIPPED_INDICES
Member

This change is in the "getLocalShardsIterator" code branch, but there is another one for PIT ("getLocalShardsIteratorFromPointInTime"). Has a similar "skip" treatment been considered there, or does this not apply because we don't expect any recently started shards in that case?

Member

Same question for "getRemoteShardsIteratorFromPointInTime"

@@ -148,6 +148,8 @@ public class TransportSearchAction extends HandledTransportAction<SearchRequest,
         Property.NodeScope
     );

+    private static final OriginalIndices SKIPPED_INDICES = new OriginalIndices(Strings.EMPTY_ARRAY, IndicesOptions.strictExpandOpen());
Member

As far as I can understand, this is a unique "marker" instance for the result map. Can you add comments along those lines?
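
To illustrate the "marker instance" reading, here is a minimal, self-contained sketch of the pattern. It assumes the marker is recognized by reference identity, as the `finalIndices == SKIPPED_INDICES` comparison in the earlier hunk suggests. The `Indices` class is a hypothetical stand-in for `OriginalIndices`, and `isSkipped` is an illustrative helper that does not appear in the PR.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class SkippedMarkerSketch {

    // Hypothetical stand-in for OriginalIndices; only holds index names.
    static final class Indices {
        final String[] names;

        Indices(String... names) {
            this.names = names;
        }
    }

    // Marker instance: its contents are never read. It exists only so that
    // map values can be compared against it by reference to detect indices
    // that should be skipped.
    static final Indices SKIPPED_INDICES = new Indices();

    static boolean isSkipped(Indices indices) {
        // Identity comparison on purpose: an equally empty but distinct
        // instance must not be treated as the marker.
        return indices == SKIPPED_INDICES;
    }

    public static void main(String[] args) {
        Map<String, Indices> resolved = new LinkedHashMap<>();
        resolved.put("blocked-index", SKIPPED_INDICES);
        resolved.put("ready-index", new Indices("ready-index"));
        resolved.put("empty-but-real", new Indices());

        for (Map.Entry<String, Indices> entry : resolved.entrySet()) {
            System.out.println(entry.getKey() + " skipped=" + isSkipped(entry.getValue()));
        }
    }
}
```

A comment on the real constant along the lines of the one on `SKIPPED_INDICES` above would make the identity-only usage explicit to future readers.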

@@ -1304,7 +1310,8 @@ private void executeSearch(

         Map<String, Float> concreteIndexBoosts = resolveIndexBoosts(searchRequest, projectState.cluster());

-        adjustSearchType(searchRequest, shardIterators.size() == 1);
+        boolean oneOrZeroValidShards = shardIterators.size() == 1 || allOrAllButOneSkipped(shardIterators);
Member

Can we simplify this logic, or at least rename it?
We originally seem to want to adjust the search type only if we have one "valid" shard iterator. What does it mean if shardIterators.size() == 1 but that shard is marked as "skipped" here? Would it even make sense to adjust the type then?

I would prefer to rewrite "allOrAllButOneSkipped" into something like "onlyOneValid" that includes the "shardIterators.size() == 1" condition and also returns "false" as soon as we find more than one non-skipped iterator.
I'm not entirely sure what we should do for:

  • shardIterators.size() == 0 (that would not lead to a search type adjustment currently)
  • shardIterators.size() >= 1 but all are skipped

Member

Reading "adjustSearchType" more closely, I think we don't need that adjustment unless we have exactly one non-skipped shard iterator.
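
A possible shape for the suggested helper, as a sketch only. `ShardIt` is a hypothetical stand-in for `SearchShardIterator` that carries just the skip flag, and `onlyOneValid` is the name floated in this thread, not an existing method. It follows the conclusion above: adjust only when exactly one iterator is not skipped, so both the empty case and the all-skipped case return false.

```java
import java.util.List;

public class OnlyOneValidSketch {

    // Hypothetical stand-in for SearchShardIterator; only the skip flag matters here.
    static final class ShardIt {
        final String index;
        final boolean skip;

        ShardIt(String index, boolean skip) {
            this.index = index;
            this.skip = skip;
        }

        boolean skip() {
            return skip;
        }
    }

    /** Returns true only when exactly one iterator is not marked as skipped. */
    static boolean onlyOneValid(List<ShardIt> shardIterators) {
        int valid = 0;
        for (ShardIt it : shardIterators) {
            if (it.skip() == false) {
                valid++;
                if (valid > 1) {
                    return false; // second non-skipped iterator found, stop early
                }
            }
        }
        return valid == 1;
    }

    public static void main(String[] args) {
        System.out.println(onlyOneValid(List.of(new ShardIt("a", false))));
        System.out.println(onlyOneValid(List.of(new ShardIt("a", true), new ShardIt("b", false))));
        System.out.println(onlyOneValid(List.of(new ShardIt("a", false), new ShardIt("b", false))));
        System.out.println(onlyOneValid(List.of()));
        System.out.println(onlyOneValid(List.of(new ShardIt("a", true), new ShardIt("b", true))));
    }
}
```

With a helper like this the call site would reduce to a single boolean passed to adjustSearchType, and the separate "shardIterators.size() == 1" check would no longer be needed.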

@@ -186,6 +186,12 @@ private void runCoordinatorRewritePhase() {
         assert assertSearchCoordinationThread();
         final List<SearchShardIterator> matchedShardLevelRequests = new ArrayList<>();
         for (SearchShardIterator searchShardIterator : shardsIts) {
+            if (searchShardIterator.prefiltered() == false && searchShardIterator.skip()) {
Member

As far as I understand this is what actually skips the shards being searched. Why is this done here in the CanMatchPreFilterSearchPhase? My understanding is that we don't always use this phase, e.g. "shouldPreFilterSearchShards" returns false for all searches that are not QUERY_THEN_FETCH (and other cases). Wouldn't we still run into 503s for those cases?

Labels
>non-issue, :Search Foundations/Search, serverless-linked, Team:Search Foundations, v9.2.0
5 participants