Skip to content

Handle remote peer exceptions #2520

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 5 commits into
base: develop
Choose a base branch
from
Open

Handle remote peer exceptions #2520

wants to merge 5 commits into from

Conversation

ERogalla
Copy link

Before this PR

We have experienced many IO exceptions originating from this undertow message.

After this PR

Don't rethrow on this error.

Comment on lines 84 to 85
&& (!exchange.getConnection().isOpen()
|| throwable.equals(UndertowMessages.MESSAGES.couldNotReadContentLengthData()))) {
Copy link
Member

@pkoenig10 pkoenig10 Apr 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: This should probably be pulled out to a helper method so we can give it a descriptive name and avoid cluttering this existing instanceof casing.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this works, I'd suggest taking another pass at the test to see what's happening in practice!

Creating a new IOException + gathering its stack trace every time we catch any sort of IOException isn't ideal.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated. @carterkozak you're right. The original test I added did not trigger the added codepath.

.equals(UndertowMessages.MESSAGES
.couldNotReadContentLengthData()
.getMessage())) {
log.info(
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it make sense to not send an error here? Or should it be a 400 error.
For reference, this error message is pulled directly from UndertowMessages

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could not read the full request from the client and client has closed its side of the connection so we can't send a response, but we need to close and clean up our exchange.

Copy link
Contributor

@schlosna schlosna left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd suggest #2523 on top of this PR

Comment on lines 83 to 84
} else if (throwable instanceof IOException) {
ioException(exchange, (IOException) throwable);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: can use instanceof pattern variable to avoid cast:

Suggested change
} else if (throwable instanceof IOException) {
ioException(exchange, (IOException) throwable);
} else if (throwable instanceof IOException ioException) {
ioException(exchange, ioException);

log.info(
"Remote peer closed connection before all data could be read. The request may have been aborted by"
+ " the client",
ioException);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
ioException);
ioException);
IoUtils.safeClose(exchange.getConnection());

Comment on lines 209 to 213
if (ioException
.getMessage()
.equals(UndertowMessages.MESSAGES
.couldNotReadContentLengthData()
.getMessage())) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ioException.getMessage() could return null leading to a NullPointerException on this error handling path, so we need to handle that.

Additionally, UndertowMessages.MESSAGES.couldNotReadContentLengthData() creates a new IOException on each invocation and adds the expensive overhead of Throwable#fillInStackTrace(). We should resolve the message once to a static final field.

I've pushed #2523 as suggested changes on top of this branch

Suggested change
if (ioException
.getMessage()
.equals(UndertowMessages.MESSAGES
.couldNotReadContentLengthData()
.getMessage())) {
if (Objects.equals(ioException.getMessage(), COULD_NOT_READ_CONTENT_LENGTH_MESSAGE)) {

.equals(UndertowMessages.MESSAGES
.couldNotReadContentLengthData()
.getMessage())) {
log.info(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could not read the full request from the client and client has closed its side of the connection so we can't send a response, but we need to close and clean up our exchange.

public void handlesPeerRemoteExceptionsWithoutRethrowing() throws IOException {
exception = UndertowMessages.MESSAGES.couldNotReadContentLengthData();
HttpURLConnection connection = execute();
assertThat(connection.getResponseCode()).isEqualTo(200);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there shouldn't be a response code if the request wasn't fully read and we threw before processing & writing response

Suggested change
assertThat(connection.getResponseCode()).isEqualTo(200);
assertThatThrownBy(() -> connection.getResponseCode())
.isInstanceOf(SocketException.class)
.hasMessageContaining("Unexpected end of file from server");

@@ -199,6 +198,34 @@ private static void deadlineExpiredException(HttpServerExchange exchange, Deadli
exception, exchange, UndertowDeadlineReasonResponseEncodingAdapter.INSTANCE);
}

private static void ioException(HttpServerExchange exchange, IOException ioException) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Move this below error to match the ordering of the cases above.

@carterkozak
Copy link
Contributor

Can you help me understand the core problem that we’re solving? The existing exception itself isn’t unreasonable imo, but it could certainly be handled better. Is that signal causing monitors to fire?

I’m a bit worried that we’re matching a specific failure where we should probably look for a class of failures, as there are other ways that requests can bail out depending on the transport encoding.

@ERogalla
Copy link
Author

ERogalla commented May 5, 2025

This was not causing any monitors to fire, but was just spamming error service logs. I was hoping to clear out any noisy/incorrect error logs that aren't actual errors. We can instruct our team to ignore these errors, but it is occurring on seemingly random service endpoints.

FWIW I'm totally open to close this PR out if this isn't the root cause. I just assumed that this was just some unhandled (but expected) exception.

@carterkozak
Copy link
Contributor

In that case, I'd attempt to cast a more precise net: Instead of updating our exception handling to check messages, we may prefer to wrap the request body conduit/channel such that we can intercept I/O exceptions while reading data and handle them more precisely without impacting the flow of exceptions thrown within service implementations.

* Close exchange on closed client

* Apply suggestions from code review

Co-authored-by: Patrick Koenig <[email protected]>

---------

Co-authored-by: Patrick Koenig <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants