Skip search shards with INDEX_REFRESH_BLOCK #129132

Open · wants to merge 15 commits into base: main
@@ -0,0 +1,135 @@
/*
* Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one
* or more contributor license agreements. Licensed under the "Elastic License
* 2.0", the "GNU Affero General Public License v3.0 only", and the "Server Side
* Public License v 1"; you may not use this file except in compliance with, at
* your election, the "Elastic License 2.0", the "GNU Affero General Public
* License v3.0 only", or the "Server Side Public License, v 1".
*/

package org.elasticsearch.search;

import org.elasticsearch.action.admin.indices.readonly.AddIndexBlockRequest;
import org.elasticsearch.action.admin.indices.readonly.TransportAddIndexBlockAction;
import org.elasticsearch.action.search.ClosePointInTimeRequest;
import org.elasticsearch.action.search.OpenPointInTimeRequest;
import org.elasticsearch.action.search.SearchRequest;
import org.elasticsearch.action.search.TransportClosePointInTimeAction;
import org.elasticsearch.action.search.TransportOpenPointInTimeAction;
import org.elasticsearch.cluster.metadata.IndexMetadata;
import org.elasticsearch.common.bytes.BytesReference;
import org.elasticsearch.core.TimeValue;
import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.search.builder.PointInTimeBuilder;
import org.elasticsearch.search.builder.SearchSourceBuilder;
import org.elasticsearch.test.ESIntegTestCase;

import static org.elasticsearch.test.hamcrest.ElasticsearchAssertions.assertHitCount;

public class SearchWithIndexBlocksIT extends ESIntegTestCase {

public void testSearchIndexWithIndexRefreshBlock() {
createIndex("test");

var addIndexBlockRequest = new AddIndexBlockRequest(IndexMetadata.APIBlock.REFRESH, "test");
client().execute(TransportAddIndexBlockAction.TYPE, addIndexBlockRequest).actionGet();
Review comment (Member) on lines +34 to +35:

The refresh block should be added automatically to newly created indices as long as they have replicas and the "use refresh block" setting is enabled in the node settings. We should remove the ability to add the refresh block through the Add Index Block API.


indexRandom(
true,
prepareIndex("test").setId("1").setSource("field", "value"),
prepareIndex("test").setId("2").setSource("field", "value"),
prepareIndex("test").setId("3").setSource("field", "value"),
prepareIndex("test").setId("4").setSource("field", "value"),
prepareIndex("test").setId("5").setSource("field", "value"),
prepareIndex("test").setId("6").setSource("field", "value")
);

assertHitCount(prepareSearch().setQuery(QueryBuilders.matchAllQuery()), 0);
}

public void testSearchMultipleIndicesEachWithAnIndexRefreshBlock() {
Review comment (Member):

I think this could be folded into a single test, where one or more indices are randomly created (some with replicas, others without), then zero or more search shards are allocated and the expected results checked, and finally all search shards are assigned and the results checked again.

createIndex("test");
createIndex("test2");

var addIndexBlockRequest = new AddIndexBlockRequest(IndexMetadata.APIBlock.REFRESH, "test", "test2");
client().execute(TransportAddIndexBlockAction.TYPE, addIndexBlockRequest).actionGet();

indexRandom(
true,
prepareIndex("test").setId("1").setSource("field", "value"),
prepareIndex("test").setId("2").setSource("field", "value"),
prepareIndex("test").setId("3").setSource("field", "value"),
prepareIndex("test").setId("4").setSource("field", "value"),
prepareIndex("test").setId("5").setSource("field", "value"),
prepareIndex("test").setId("6").setSource("field", "value"),
prepareIndex("test2").setId("1").setSource("field", "value"),
prepareIndex("test2").setId("2").setSource("field", "value"),
prepareIndex("test2").setId("3").setSource("field", "value"),
prepareIndex("test2").setId("4").setSource("field", "value"),
prepareIndex("test2").setId("5").setSource("field", "value"),
prepareIndex("test2").setId("6").setSource("field", "value")
);

assertHitCount(prepareSearch().setQuery(QueryBuilders.matchAllQuery()), 0);
}
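The randomized test suggested in the review needs a way to compute the expected hit count for each phase. The sketch below is not an ESIntegTestCase; it is a tiny self-contained model of that bookkeeping, where `TestIndex` and `expectedHits` are hypothetical names and the model assumes (per the review) that refresh-blocked indices contribute no hits until search shards are assigned:

```java
import java.util.List;

// Minimal model of the expected-hit accounting for the suggested
// randomized test. TestIndex and expectedHits are hypothetical names.
public class ExpectedHitsSketch {

    // A hypothetical index in the randomized test: its doc count and whether
    // its shards carry INDEX_REFRESH_BLOCK (i.e. it was created with replicas).
    record TestIndex(String name, int docs, boolean refreshBlocked) {}

    // Before search shards are assigned, blocked indices contribute no hits;
    // once they are assigned, every index contributes all of its docs.
    static long expectedHits(List<TestIndex> indices, boolean searchShardsAssigned) {
        return indices.stream()
            .filter(i -> searchShardsAssigned || i.refreshBlocked() == false)
            .mapToLong(TestIndex::docs)
            .sum();
    }
}
```

The same helper then gives both assertions of the suggested test: once with the flag false (before assigning search shards) and once with it true (after).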

public void testSearchMultipleIndicesWithOneIndexRefreshBlock() {
createIndex("test");
createIndex("test2");

// Only block test
var addIndexBlockRequest = new AddIndexBlockRequest(IndexMetadata.APIBlock.REFRESH, "test");
client().execute(TransportAddIndexBlockAction.TYPE, addIndexBlockRequest).actionGet();

indexRandom(
true,
prepareIndex("test").setId("1").setSource("field", "value"),
prepareIndex("test").setId("2").setSource("field", "value"),
prepareIndex("test").setId("3").setSource("field", "value"),
prepareIndex("test").setId("4").setSource("field", "value"),
prepareIndex("test").setId("5").setSource("field", "value"),
prepareIndex("test").setId("6").setSource("field", "value"),
prepareIndex("test2").setId("1").setSource("field", "value"),
prepareIndex("test2").setId("2").setSource("field", "value"),
prepareIndex("test2").setId("3").setSource("field", "value"),
prepareIndex("test2").setId("4").setSource("field", "value"),
prepareIndex("test2").setId("5").setSource("field", "value"),
prepareIndex("test2").setId("6").setSource("field", "value")
);

// We should get test2 results (not blocked)
assertHitCount(prepareSearch().setQuery(QueryBuilders.matchAllQuery()), 6);
}

public void testOpenPITWithIndexRefreshBlock() {
createIndex("test");

var addIndexBlockRequest = new AddIndexBlockRequest(IndexMetadata.APIBlock.REFRESH, "test");
client().execute(TransportAddIndexBlockAction.TYPE, addIndexBlockRequest).actionGet();

indexRandom(
true,
prepareIndex("test").setId("1").setSource("field", "value"),
prepareIndex("test").setId("2").setSource("field", "value"),
prepareIndex("test").setId("3").setSource("field", "value"),
prepareIndex("test").setId("4").setSource("field", "value"),
prepareIndex("test").setId("5").setSource("field", "value"),
prepareIndex("test").setId("6").setSource("field", "value")
);

BytesReference pitId = null;
try {
OpenPointInTimeRequest openPITRequest = new OpenPointInTimeRequest("test").keepAlive(TimeValue.timeValueSeconds(10))
.allowPartialSearchResults(true);
pitId = client().execute(TransportOpenPointInTimeAction.TYPE, openPITRequest).actionGet().getPointInTimeId();
SearchRequest searchRequest = new SearchRequest().source(
new SearchSourceBuilder().pointInTimeBuilder(new PointInTimeBuilder(pitId).setKeepAlive(TimeValue.timeValueSeconds(10)))
);
assertHitCount(client().search(searchRequest), 0);
} finally {
if (pitId != null) {
client().execute(TransportClosePointInTimeAction.TYPE, new ClosePointInTimeRequest(pitId)).actionGet();
}
}
}
}
@@ -186,6 +186,12 @@ private void runCoordinatorRewritePhase() {
assert assertSearchCoordinationThread();
final List<SearchShardIterator> matchedShardLevelRequests = new ArrayList<>();
for (SearchShardIterator searchShardIterator : shardsIts) {
if (searchShardIterator.prefiltered() == false && searchShardIterator.skip()) {
Review comment (Member):

As far as I understand this is what actually skips the shards being searched. Why is this done here in the CanMatchPreFilterSearchPhase? My understanding is that we don't always use this phase, e.g. "shouldPreFilterSearchShards" returns false for all searches that are not QUERY_THEN_FETCH (and other cases). Wouldn't we still run into 503s for those cases?

// This implies the iterator was skipped due to an index level block,
// not a remote can-match run.
continue;
}

final CanMatchNodeRequest canMatchNodeRequest = new CanMatchNodeRequest(
request,
searchShardIterator.getOriginalIndices().indicesOptions(),
@@ -41,7 +41,7 @@ public final class SearchShardIterator implements Comparable<SearchShardIterator

/**
* Creates a {@link SearchShardIterator} instance that iterates over a subset of the given shards
* this the a given <code>shardId</code>.
* for a given <code>shardId</code>.
*
* @param clusterAlias the alias of the cluster where the shard is located
* @param shardId shard id of the group
@@ -54,6 +54,28 @@ public SearchShardIterator(@Nullable String clusterAlias, ShardId shardId, List<

/**
* Creates a {@link SearchShardIterator} instance that iterates over a subset of the given shards
* for a given <code>shardId</code>.
*
* @param clusterAlias the alias of the cluster where the shard is located
* @param shardId shard id of the group
* @param shards shards to iterate
* @param originalIndices the indices that the search request originally related to (before any rewriting happened)
* @param skip if true, then this group won't have matches (due to an index level block),
* and it can be safely skipped from the search
*/
public SearchShardIterator(
@Nullable String clusterAlias,
ShardId shardId,
List<ShardRouting> shards,
OriginalIndices originalIndices,
boolean skip
) {
this(clusterAlias, shardId, shards.stream().map(ShardRouting::currentNodeId).toList(), originalIndices, null, null, false, skip);
}

/**
* Creates a {@link SearchShardIterator} instance that iterates over a subset of the given shards
* for a given <code>shardId</code>.
*
* @param clusterAlias the alias of the cluster where the shard is located
* @param shardId shard id of the group
@@ -62,7 +84,8 @@ public SearchShardIterator(@Nullable String clusterAlias, ShardId shardId, List<
* @param searchContextId the point-in-time specified for this group if exists
* @param searchContextKeepAlive the time interval that data nodes should extend the keep alive of the point-in-time
* @param prefiltered if true, then this group already executed the can_match phase
* @param skip if true, then this group won't have matches, and it can be safely skipped from the search
* @param skip if true, then this group won't have matches (due to can match, or an index level block),
* and it can be safely skipped from the search
*/
public SearchShardIterator(
@Nullable String clusterAlias,
@@ -83,7 +106,6 @@ public SearchShardIterator(
assert searchContextKeepAlive == null || searchContextId != null;
this.prefiltered = prefiltered;
this.skip = skip;
assert skip == false || prefiltered : "only prefiltered shards are skip-able";
}

/**
@@ -148,6 +148,8 @@ public class TransportSearchAction extends HandledTransportAction<SearchRequest,
Property.NodeScope
);

private static final OriginalIndices SKIPPED_INDICES = new OriginalIndices(Strings.EMPTY_ARRAY, IndicesOptions.strictExpandOpen());
Review comment (Member):

As far as I can understand, this is some unique "Marker" instance for the result map. Can you add comments along that line?


private final ThreadPool threadPool;
private final ClusterService clusterService;
private final TransportService transportService;
@@ -233,6 +235,10 @@ private Map<String, OriginalIndices> buildPerIndexOriginalIndices(
for (String index : indices) {
if (hasBlocks) {
blocks.indexBlockedRaiseException(projectState.projectId(), ClusterBlockLevel.READ, index);
if (blocks.hasIndexBlock(projectState.projectId(), index, IndexMetadata.INDEX_REFRESH_BLOCK)) {
res.put(index, SKIPPED_INDICES);
continue;
}
}

String[] aliases = indexNameExpressionResolver.allIndexAliases(projectState.metadata(), index, indicesAndAliases);
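The `SKIPPED_INDICES` constant used above is the "marker instance" the review asks to document: callers detect it later by reference identity rather than by `equals()`. A small self-contained illustration of that pattern (all names below are stand-ins, not Elasticsearch API):

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy illustration of the sentinel ("marker instance") pattern.
public class SentinelSketch {

    // Unique marker instance: blocked indices map to this exact object,
    // and callers detect it by reference identity (==), never equals().
    static final String[] SKIPPED = new String[0];

    static Map<String, String[]> resolve(List<String> indices, List<String> blocked) {
        Map<String, String[]> res = new HashMap<>();
        for (String index : indices) {
            // Blocked indices get the marker itself, never an equal copy.
            res.put(index, blocked.contains(index) ? SKIPPED : new String[] { index });
        }
        return res;
    }

    static boolean isSkipped(Map<String, String[]> res, String index) {
        return res.get(index) == SKIPPED; // identity check, like finalIndices == SKIPPED_INDICES
    }
}
```

Identity comparison is what makes the sentinel safe here: a fresh `new String[0]` created elsewhere would not be mistaken for the marker, which is why the PR later checks `finalIndices == SKIPPED_INDICES`.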
@@ -588,7 +594,7 @@ public void onFailure(Exception e) {}
);
}

static void adjustSearchType(SearchRequest searchRequest, boolean singleShard) {
static void adjustSearchType(SearchRequest searchRequest, boolean oneOrZeroValidShards) {
// if there's a kNN search, always use DFS_QUERY_THEN_FETCH
if (searchRequest.hasKnnSearch()) {
searchRequest.searchType(DFS_QUERY_THEN_FETCH);
@@ -603,7 +609,7 @@ static void adjustSearchType(SearchRequest searchRequest, boolean singleShard) {
}

// optimize search type for cases where there is only one shard group to search on
if (singleShard) {
if (oneOrZeroValidShards) {
// if we only have one group, then we always want Q_T_F, no need for DFS, and no need to do THEN since we hit one shard
searchRequest.searchType(QUERY_THEN_FETCH);
}
@@ -1304,7 +1310,8 @@ private void executeSearch(

Map<String, Float> concreteIndexBoosts = resolveIndexBoosts(searchRequest, projectState.cluster());

adjustSearchType(searchRequest, shardIterators.size() == 1);
boolean oneOrZeroValidShards = shardIterators.size() == 1 || allOrAllButOneSkipped(shardIterators);
Review comment (Member):

Can we simplify this logic or at least rename it?
We originally seem to want to adjust the search type if we only have one "valid" shard iterator. What does that mean if we have shardIterators.size() but that shard is marked as "skipped" here? Would it even make sense to adjust the type then?

I would prefer to rewrite "allOrAllButOneSkipped" into something like "onlyOneValid" or similar that would include the "shardIterators.size() == 1" condition and in addition return "false" as soon as we found >1 non-skipped iterator.
I'm not entirely sure what we should do for:

  • shardIterators.size() == 0 (that would not lead to search type adjustment currently)
  • shardIterators.size() >= 1 but all are skipped

Review comment (Member):

Reading "adjustSearchType" more closely, I think we don't need that adjustment unless we have exactly one non-skipped shard iterator.

adjustSearchType(searchRequest, oneOrZeroValidShards);

final DiscoveryNodes nodes = projectState.cluster().nodes();
BiFunction<String, String, Transport.Connection> connectionLookup = buildConnectionLookup(
@@ -1337,6 +1344,30 @@
);
}

/**
* Determines if all, or all but one, iterators are skipped.
* (At this point, iterators may be marked as skipped due to index level blocks.)
* We expect skipped iterators to be unlikely, so returning fast after we see more
* than one "not skipped" is an intended optimization.
*
* @param searchShardIterators all the shard iterators derived from indices being searched
* @return true if all of them are already skipped, or only one is not skipped
*/
private boolean allOrAllButOneSkipped(List<SearchShardIterator> searchShardIterators) {
int notSkippedCount = 0;

for (SearchShardIterator searchShardIterator : searchShardIterators) {
if (searchShardIterator.skip() == false) {
notSkippedCount++;
if (notSkippedCount > 1) {
return false;
}
}
}

return true;
}
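The review suggests replacing `allOrAllButOneSkipped` plus the `shardIterators.size() == 1` check with a single `onlyOneValid` predicate. A self-contained sketch of what that could look like is below; `ShardIt` is a hypothetical stand-in for `SearchShardIterator`, and returning false when zero iterators are valid is an assumption (the review leaves that case open):

```java
import java.util.List;

// Sketch of the reviewer-suggested "onlyOneValid" predicate; ShardIt is a
// hypothetical stand-in for SearchShardIterator where only skip() matters.
public class OnlyOneValidSketch {

    record ShardIt(boolean skip) {}

    // True only when exactly one iterator is not skipped. Bails out as soon
    // as a second non-skipped iterator is seen, preserving the early-return
    // optimization of the original helper.
    static boolean onlyOneValid(List<ShardIt> iterators) {
        int notSkipped = 0;
        for (ShardIt it : iterators) {
            if (it.skip() == false) {
                notSkipped++;
                if (notSkipped > 1) {
                    return false;
                }
            }
        }
        return notSkipped == 1;
    }
}
```

Folding both conditions into one predicate also answers the reviewer's edge cases by construction: zero iterators, or all skipped, yield false, so no search type adjustment happens.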

Executor asyncSearchExecutor(final String[] indices) {
boolean seenSystem = false;
boolean seenCritical = false;
@@ -1889,7 +1920,13 @@ List<SearchShardIterator> getLocalShardsIterator(
final ShardId shardId = shardRouting.shardId();
OriginalIndices finalIndices = originalIndices.get(shardId.getIndex().getName());
assert finalIndices != null;
list[i++] = new SearchShardIterator(clusterAlias, shardId, shardRouting.getShardRoutings(), finalIndices);
list[i++] = new SearchShardIterator(
clusterAlias,
shardId,
shardRouting.getShardRoutings(),
finalIndices,
finalIndices == SKIPPED_INDICES
Review comment (Member):

This change is in the "getLocalShardsIterator" code branch, however there is another one for PIT ("getLocalShardsIteratorFromPointInTime"), has this been considered as getting a similar "skip" treatment or doesn't this apply because we don't expect any recently started shards there?

Review comment (Member):

Same question for "getRemoteShardsIteratorFromPointInTime"

);
}
// the returned list must support in-place sorting, so this is the most memory efficient we can do here
return Arrays.asList(list);
@@ -283,7 +283,8 @@ public enum APIBlock implements Writeable {
READ("read", INDEX_READ_BLOCK, Property.ServerlessPublic),
WRITE("write", INDEX_WRITE_BLOCK, Property.ServerlessPublic),
METADATA("metadata", INDEX_METADATA_BLOCK, Property.ServerlessPublic),
READ_ONLY_ALLOW_DELETE("read_only_allow_delete", INDEX_READ_ONLY_ALLOW_DELETE_BLOCK);
READ_ONLY_ALLOW_DELETE("read_only_allow_delete", INDEX_READ_ONLY_ALLOW_DELETE_BLOCK),
REFRESH("refresh", INDEX_REFRESH_BLOCK);
Review comment (Member):

I don't think we should allow blocking refreshes on indices after they have been created. Can we revert this please?


final String name;
final String settingName;