-
Notifications
You must be signed in to change notification settings - Fork 25.3k
ESQL: Add documents_found
and values_loaded
(#125631)
#130029
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
This adds `documents_found` and `values_loaded` to the to the ESQL response: ```json { "took" : 194, "is_partial" : false, "documents_found" : 100000, "values_loaded" : 200000, "columns" : [ { "name" : "a", "type" : "long" }, { "name" : "b", "type" : "long" } ], "values" : [[10, 1]] } ``` These are cheap enough to collect that we can do it for every query and return it with every response. It's small, but it still gives you a reasonable sense of how much work Elasticsearch had to go through to perform the query. I've also added these two fields to the driver profile and task status: ```json "drivers" : [ { "description" : "data", "cluster_name" : "runTask", "node_name" : "runTask-0", "start_millis" : 1742923173077, "stop_millis" : 1742923173087, "took_nanos" : 9557014, "cpu_nanos" : 9091340, "documents_found" : 5, <---- THESE "values_loaded" : 15, <---- THESE "iterations" : 6, ... ``` These are at a high level and should be easy to reason about. We'd like to extract this into a "show me how difficult this running query is" API one day. But today, just plumbing it into the debugging output is good. Any `Operator` can claim to "find documents" or "load values" by overriding a method on its `Operator.Status` implementation: ```java /** * The number of documents found by this operator. Most operators * don't find documents and will return {@code 0} here. */ default long documentsFound() { return 0; } /** * The number of values loaded by this operator. Most operators * don't load values and will return {@code 0} here. */ default long valuesLoaded() { return 0; } ``` In this PR all of the `LuceneOperator`s declare that each `position` they emit is a "document found" and the `ValuesSourceValuesSourceReaderOperator` says each value it makes is a "value loaded". That's pretty pretty much true. The `LuceneCountOperator` and `LuceneMinMaxOperator` sort of pretend that the count/min/max that they emit is a "document" - but that's good enough to give you a sense of what's going on. It's *like* document.
Documentation preview: |
Also needs a manual review from me to make sure the backport is truly just what it should be. It wasn't clean at all and I had to make a lot of modifications. I have to double check those are sane. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems right to me modulo the three things I found. I'll fix those in a moment.
threadPool.getThreadContext().addResponseHeader(AsyncExecutionId.ASYNC_EXECUTION_IS_RUNNING_HEADER, "?0"); | ||
if (task instanceof EsqlQueryTask asyncTask && request.keepOnCompletion()) { | ||
String asyncExecutionId = asyncTask.getExecutionId().getEncoded(); | ||
threadPool.getThreadContext().addResponseHeader(AsyncExecutionId.ASYNC_EXECUTION_ID_HEADER, asyncExecutionId); | ||
return new EsqlQueryResponse( | ||
columns, | ||
result.pages(), | ||
result.completionInfo().documentsFound(), | ||
result.completionInfo().documentsFound(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks wrong. And it's wrong in main too!
} | ||
default -> throw new IllegalArgumentException(); | ||
}; | ||
} | ||
; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Leftover.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also in main. nuking.
@idegtiarenko, could you have a look at this and double check it against the original PR? It looks right to me, but I'd appreciate a second set of eyes. And you are the one who needs this backport in so you get to suffer a little. |
This adds
documents_found
andvalues_loaded
to the to the ESQL response:These are cheap enough to collect that we can do it for every query and return it with every response. It's small, but it still gives you a reasonable sense of how much work Elasticsearch had to go through to perform the query.
I've also added these two fields to the driver profile and task status:
These are at a high level and should be easy to reason about. We'd like to extract this into a "show me how difficult this running query is" API one day. But today, just plumbing it into the debugging output is good.
Any
Operator
can claim to "find documents" or "load values" by overriding a method on itsOperator.Status
implementation:In this PR all of the
LuceneOperator
s declare that eachposition
they emit is a "document found" and theValuesSourceValuesSourceReaderOperator
says each value it makes is a "value loaded". That's pretty pretty much true. TheLuceneCountOperator
andLuceneMinMaxOperator
sort of pretend that the count/min/max that they emit is a "document" - but that's good enough to give you a sense of what's going on. It's like document.