Only link delayed transport AFTER real transport has called transportReady() #1494
@@ -346,8 +333,18 @@ public void transportShutdown(Status s) {
    new Object[] {transport, address});
I think it's okay for now to mostly lose the statuses for earlier failed transports. However, it would probably be a good idea to add s to this log output in case someone needs to debug why earlier addresses failed.
Done.
@zhangkun83 LGTM. Travis on Linux got stuck on :grpc-interop-testing:test for 10 minutes without making progress. I restarted the job and it seems to have hung on it again. I don't expect it is due to a bug in this PR, but this PR may be triggering the breakage. I've been having trouble with timeouts being exceeded in interop-testing with some of the PRs I've done. I think it is memory-related.
I logged the test methods. It's stuck in
Never mind. The test seems to be stuck in a different method each time.
test {
  testLogging {
    exceptionFormat = 'full'
    showExceptions true
    showCauses true
    showStackTraces true
    maxGranularity 3
    minGranularity 3
    events = ["started", "failed", "passed"]
  }
  maxHeapSize = '2000m'
}
Still stuck.
I am sure this is the kind of deadlock that I worried about in #1408.
This change will make |
349f4cd to 5f95085
Rebased onto master so it includes the fix from #1526.
which I believe is unrelated to this change. @ejona86 PTAL |
@zhangkun83, the new commit LGTM. And yeah, |
…Ready().

If TransportSet fails to connect a transport (i.e., transportShutdown() called without transportReady()), TransportSet will automatically schedule reconnection for the next address, unless it has reached the end of the address list, in which case it will fail the delayed transport. This will reduce stream errors caused by bad addresses appearing before good addresses in the resolved address list.

Before this change, TransportSet would return the real transport on the first call of obtainActiveTransport(). After this change, it will return the delayed transport instead.
5f95085 to 08c74d4
Previously, TransportSet.shutdown() only shut down the active transport, which meant that a transport was not shut down if it was not ready yet. This issue was introduced by grpc#1494, which postponed the assignment of the active transport until the transport is ready.
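To illustrate the shutdown issue described above, here is a minimal sketch, assuming a much simplified model. The class `ShutdownSketch`, its `Transport` interface, and the `pending`/`active` fields are hypothetical names for illustration and are not taken from the grpc-java sources. The point it shows is that once the active transport is only assigned on `transportReady()`, `shutdown()` also has to reach a transport that is still connecting.

```java
/** Hypothetical, simplified model; not the grpc-java TransportSet. */
final class ShutdownSketch {
  /** Stand-in for a real transport; only shutdown matters here. */
  interface Transport {
    void shutdown();
  }

  private Transport pending; // started, but transportReady() not called yet
  private Transport active;  // assigned only after transportReady()
  private boolean shutdown;

  /** Begins tracking a transport that is still connecting. */
  void startTransport(Transport transport) {
    if (shutdown) {
      transport.shutdown(); // never keep a transport alive after shutdown
      return;
    }
    pending = transport;
  }

  /** Promotes the connecting transport to the active one. */
  void transportReady() {
    if (pending == null) {
      throw new IllegalStateException("no transport is connecting");
    }
    active = pending;
    pending = null;
  }

  /** Shuts down both the active transport and any transport that is not ready yet. */
  void shutdown() {
    shutdown = true;
    if (active != null) {
      active.shutdown();
    }
    if (pending != null) {
      pending.shutdown(); // the case that would be missed if only 'active' were checked
    }
    active = null;
    pending = null;
  }
}
```

In this model, shutting down only `active` would leak a transport that was started but never became ready, which is the behavior the commit message describes.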
Resolves #1477
If TransportSet fails to connect a transport (i.e., transportShutdown()
called without transportReady()), TransportSet will automatically
schedule reconnection for the next address, unless it has reached the end
of the address list, in which case it will fail the delayed transport.
This will avoid stream errors caused by bad addresses appearing before
good addresses in the resolved address list.
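The failover behavior described above can be sketched roughly as follows. This is a simplified model under stated assumptions, not the actual TransportSet code: the class `FailoverSketch`, the `DelayedState` enum, the string addresses, and the accessor methods are all hypothetical, and backoff, threading, and reconnection after a ready transport goes away are left out.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical, simplified model of the failover described above; not the grpc-java code. */
final class FailoverSketch {
  enum DelayedState { PENDING, LINKED, FAILED }

  private final List<String> addresses;
  private int nextIndex = 0;
  private String connecting;     // address being tried; null when nothing is connecting
  private String activeAddress;  // assigned only after transportReady()
  private DelayedState delayedState = DelayedState.PENDING;

  FailoverSketch(List<String> addresses) {
    if (addresses.isEmpty()) {
      throw new IllegalArgumentException("need at least one address");
    }
    this.addresses = new ArrayList<>(addresses);
  }

  /** Starts connecting if needed; the caller only ever sees the delayed transport's state. */
  DelayedState obtainActiveTransport() {
    if (connecting == null && activeAddress == null && delayedState == DelayedState.PENDING) {
      connecting = addresses.get(nextIndex++);
    }
    return delayedState;
  }

  /** The real transport connected: only now is the delayed transport linked to it. */
  void transportReady() {
    if (connecting == null) {
      throw new IllegalStateException("no transport is connecting");
    }
    activeAddress = connecting;
    connecting = null;
    delayedState = DelayedState.LINKED;
  }

  /** A shutdown without a prior transportReady() counts as a failed connection attempt. */
  void transportShutdown(String status) {
    if (connecting == null) {
      activeAddress = null; // a ready transport went away; out of scope for this sketch
      return;
    }
    System.out.println("connect to " + connecting + " failed: " + status);
    if (nextIndex < addresses.size()) {
      connecting = addresses.get(nextIndex++); // try the next address
    } else {
      connecting = null;
      delayedState = DelayedState.FAILED;      // end of the list: fail the delayed transport
    }
  }

  String connectingAddress() { return connecting; }
  String activeAddress() { return activeAddress; }
  DelayedState delayedState() { return delayedState; }
}
```

With an address list of one bad address followed by one good address, a failed first attempt moves on to the second address while the delayed transport stays pending, so callers are not handed a transport that never became ready.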
@ejona86 Please review this PR as a whole. I will squash all commits upon check-in.