@yanjunxiang-google
Contributor

@yanjunxiang-google yanjunxiang-google commented Oct 8, 2025

This addresses issue #41764, i.e., closing the gRPC stream once no further external processing is needed. It is also partially discussed in #37088.

Currently, the ext_proc gRPC stream is opened when the first ProcessingRequest is sent to the ext_proc server, and it is closed only when the ext_proc filter is destroyed. This wastes resources on both the Envoy side and the ext_proc server side. For example, if Envoy is configured to send only request headers, the gRPC stream stays open until the response has been fully processed.

This PR closes the ext_proc gRPC stream as soon as Envoy detects that no more external processing is needed.
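
As a rough illustration of the closing condition, here is a minimal C++ sketch. It is not the filter's actual code: `DirectionState`, its fields, and `canCloseStream` are hypothetical names, and `noExternalProcess` only borrows its name from the diff quoted in this review, with an assumed body. The assumption is that each direction (request, response) is finished once every message type configured to be sent has been answered.

```cpp
// Hypothetical, simplified model of one direction (request or response).
// None of these names are taken from Envoy's real ext_proc implementation.
struct DirectionState {
  bool send_headers = false;   // headers are configured to go to the server
  bool send_body = false;      // body is configured to go to the server
  bool send_trailers = false;  // trailers are configured to go to the server
  bool headers_done = false;   // the header response has been received
  bool body_done = false;      // the last body response has been received
  bool trailers_done = false;  // the trailer response has been received

  // True when nothing in this direction still needs the ext_proc server.
  bool noExternalProcess() const {
    return (!send_headers || headers_done) &&
           (!send_body || body_done) &&
           (!send_trailers || trailers_done);
  }
};

// The stream can be closed only when both directions are finished.
bool canCloseStream(const DirectionState& request, const DirectionState& response) {
  return request.noExternalProcess() && response.noExternalProcess();
}
```

With only `send_headers` set on the request direction, `canCloseStream` becomes true as soon as `headers_done` is set, which matches the "only send request headers" example above.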

Signed-off-by: Yanjun Xiang <[email protected]>
@yanjunxiang-google
Contributor Author

/retest

@yanjunxiang-google
Contributor Author

/assign @yanavlasov @tyxia @stevenzzzz

Signed-off-by: Yanjun Xiang <[email protected]>
Signed-off-by: Yanjun Xiang <[email protected]>
Signed-off-by: Yanjun Xiang <[email protected]>
Contributor

@yanavlasov yanavlasov left a comment

/wait

Signed-off-by: Yanjun Xiang <[email protected]>
Contributor

@stevenzzzz stevenzzzz left a comment

Hooah. You are fighting a beast here. :)

At a high level, since this is a behavior change that impacts prod traffic, could you guard this change with a feature flag?
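
For readers unfamiliar with the pattern, a feature-flag guard can be sketched as below. Everything here is an assumption for illustration: `FeatureFlags` is a hypothetical stand-in for a runtime flag lookup, the flag name `example.ext_proc_close_stream_early` is made up, and unknown flags default to off only in this sketch.

```cpp
#include <string>
#include <unordered_map>

// Hypothetical stand-in for a runtime flag registry (not Envoy's runtime API).
class FeatureFlags {
public:
  void set(const std::string& name, bool enabled) { flags_[name] = enabled; }

  // Flags that were never set are treated as disabled in this sketch.
  bool enabled(const std::string& name) const {
    auto it = flags_.find(name);
    return it != flags_.end() && it->second;
  }

private:
  std::unordered_map<std::string, bool> flags_;
};

// Made-up flag name, used only for this example.
constexpr const char* kCloseEarlyFlag = "example.ext_proc_close_stream_early";

enum class StreamAction { KeepOpen, Close };

// With the flag off, the old behavior (keep the stream open) is preserved.
StreamAction onProcessingComplete(const FeatureFlags& flags, bool no_more_processing) {
  if (!flags.enabled(kCloseEarlyFlag)) {
    return StreamAction::KeepOpen;
  }
  return no_more_processing ? StreamAction::Close : StreamAction::KeepOpen;
}
```

The point of the guard is that the new closing logic can be switched off without a code change if it misbehaves on production traffic.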

Contributor

@stevenzzzz stevenzzzz left a comment

Left some comments. Please consider some common cases like these (there are surely more):

  1. CONTINUE_AND_REPLACE finishes all of one direction's events.
  2. EOS from trailers implicitly terminates the body.
  3. Only check if-last-response after the response is "processed"; in error cases, the stream will already be closed.
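
The first two cases in the list can be sketched roughly as follows. This is a simplified model with hypothetical names (`Direction`, `onHeaderResponse`, `onTrailersReceived`, `directionDone`), not the filter's real state machine. It assumes that a CONTINUE_AND_REPLACE header response means no further messages are sent in that direction, and that receiving trailers means no more body chunks can arrive.

```cpp
// Hypothetical status values mirroring the two outcomes discussed above.
enum class HeaderStatus { Continue, ContinueAndReplace };

// Hypothetical per-direction bookkeeping; not Envoy's actual state.
struct Direction {
  bool body_pending = true;      // body chunks may still go to the server
  bool trailers_pending = true;  // trailers may still go to the server
};

// Case 1: CONTINUE_AND_REPLACE ends all remaining events in this direction.
void onHeaderResponse(Direction& d, HeaderStatus status) {
  if (status == HeaderStatus::ContinueAndReplace) {
    d.body_pending = false;
    d.trailers_pending = false;
  }
}

// Case 2: once trailers arrive, the body is implicitly complete.
// The trailers themselves may still need processing.
void onTrailersReceived(Direction& d) { d.body_pending = false; }

bool directionDone(const Direction& d) {
  return !d.body_pending && !d.trailers_pending;
}
```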

At a high level, since we have already gone this far, let's also consider waiting for trailers after the half-close has been sent. That could go in another PR, I assume.

    }
    break;
  case ProcessingResponse::ResponseCase::kRequestBody:
    if (isLastBodyResponse(decoding_state_, *response) && encoding_state_.noExternalProcess()) {
Contributor

what if trailers are configured?

Contributor Author

Here we detect that EoS was received with the body, so this request has no trailers. Thus, even if the filter is configured to send trailers, external processing in this direction is complete.
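
A minimal sketch of that reasoning, using hypothetical names (`BodyChunk`, `TrailerMode`, `requestDirectionDone`) rather than the filter's real helpers: when end_of_stream arrives on a body chunk, the message cannot carry trailers, so the configured trailer mode does not affect the outcome.

```cpp
// Hypothetical types for illustration only.
enum class TrailerMode { Skip, Send };

struct BodyChunk {
  bool end_of_stream;  // true when this chunk ends the HTTP message
};

// Returns true when the request direction needs no more external processing
// after this body chunk has been handled. The trailer mode is accepted but
// deliberately unused: end_of_stream on the body means no trailers exist.
bool requestDirectionDone(const BodyChunk& chunk, TrailerMode /*trailer_mode*/) {
  if (chunk.end_of_stream) {
    return true;  // last body chunk, and no trailers can follow
  }
  return false;  // more body chunks, or trailers, may still arrive
}
```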

Signed-off-by: Yanjun Xiang <[email protected]>
@botengyao
Member

Waiting for the comments to be addressed.

/wait

Signed-off-by: Yanjun Xiang <[email protected]>
Signed-off-by: Yanjun Xiang <[email protected]>
Signed-off-by: Yanjun Xiang <[email protected]>
@repokitteh-read-only

CC @envoyproxy/runtime-guard-changes: FYI only for changes made to (source/common/runtime/runtime_features.cc).

🐱

Caused by: #41425 was synchronize by yanjunxiang-google.

see: more, trace.

Signed-off-by: Yanjun Xiang <[email protected]>
Signed-off-by: Yanjun Xiang <[email protected]>
Member

@tyxia tyxia left a comment

Thanks for working on this!

+1 on a runtime guard to protect this change. The processing mode/ext_proc is getting more and more complicated :)

Signed-off-by: Yanjun Xiang <[email protected]>
@yanjunxiang-google
Contributor Author

yanjunxiang-google commented Oct 17, 2025

The initial optimization does not apply to the few scenarios listed below, which are left as future work:

  1. BUFFERED and BUFFERED_PARTIAL body modes.
  2. During header response processing, the request has no body but has trailers, the trailers have been received, the body sending mode is not NONE, and the trailer sending mode is SKIP (an unlikely request pattern; corner case).
  3. During header response processing, the request has a body, end_of_stream has been received with the body, the body sending mode is NONE, and the trailer sending mode is SEND (an unlikely processing mode configuration; corner case).
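
The three exclusions above can be written down as a single predicate. This is a literal encoding of the list under assumed names (`HeaderResponseContext`, `earlyCloseNotApplied`, and the enum spellings are hypothetical), not the checks the PR actually performs.

```cpp
// Hypothetical enums; only the modes mentioned in this thread are modeled.
enum class BodyMode { None, Streamed, Buffered, BufferedPartial };
enum class TrailerMode { Skip, Send };

// Hypothetical snapshot of what is known while a header response is processed.
struct HeaderResponseContext {
  BodyMode body_mode;
  TrailerMode trailer_mode;
  bool has_body;                 // the request carries a body
  bool trailers_received;        // trailers have already arrived
  bool end_of_stream_with_body;  // end_of_stream arrived on a body chunk
};

// Returns true for the scenarios where the early close is not applied.
bool earlyCloseNotApplied(const HeaderResponseContext& c) {
  // Scenario 1: buffered body modes.
  if (c.body_mode == BodyMode::Buffered || c.body_mode == BodyMode::BufferedPartial) {
    return true;
  }
  // Scenario 2: no body, trailers received, body mode not NONE, trailers SKIP.
  if (!c.has_body && c.trailers_received && c.body_mode != BodyMode::None &&
      c.trailer_mode == TrailerMode::Skip) {
    return true;
  }
  // Scenario 3: body ended the stream, body mode NONE, trailers SEND.
  if (c.has_body && c.end_of_stream_with_body && c.body_mode == BodyMode::None &&
      c.trailer_mode == TrailerMode::Send) {
    return true;
  }
  return false;
}
```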

@yanjunxiang-google
Contributor Author

Kind Ping!

@yanjunxiang-google
Contributor Author

/retest

Contributor

@stevenzzzz stevenzzzz left a comment

Thanks for working on this!

@yanjunxiang-google
Contributor Author

/retest

@yanjunxiang-google
Contributor Author

/retest

@yanjunxiang-google
Contributor Author

/retest

@tyxia
Member

tyxia commented Oct 29, 2025

@yanjunxiang-google Could you please resolve any conversations that you think are already resolved? It will help the review.

Thanks!

@yanjunxiang-google
Contributor Author

> @yanjunxiang-google Could you please resolve any conversations that you think are already resolved? It will help the review.
>
> Thanks!

Done, thanks!

@yanjunxiang-google
Contributor Author

/retest

@yanjunxiang-google
Contributor Author

/retest

@yanjunxiang-google
Contributor Author

/retest

@yanavlasov
Contributor

LGTM from me. Will wait for @tyxia and @stevenzzzz reviews.

Member

@tyxia tyxia left a comment

LGTM, Thanks

I think this PR is a best-effort attempt to close the gRPC stream ASAP, with the trade-off of added corner-case complexity. It would be good to also run it under some load tests.

@yanjunxiang-google
Contributor Author

/retest

5 participants