We see reports of similar observations: a command suddenly takes longer than it should, and we cannot pinpoint what is causing the increase in latency. We cannot easily solve this because a command can only complete after the previously sent command on the same connection has completed. A retry approach like this would therefore require a fresh Redis connection (TCP). In a full SSL setup, I'm not sure that the TCP, SSL, and HELLO handshakes would complete quicker than waiting for the command to complete. Obviously, if the hanging command lingers for a minute, then we might have a different problem.
I think this is a reasonable approach to ensure application responsiveness.
-
Has the idea of implementing speculative retry for non-mutating commands in cluster mode been considered before? (I couldn't find it in the search.)
This may be a very specialized situation, but in some read-heavy workloads I'm seeing patterns like this at the 99.9th percentile (concurrency in this diagram is 3 for the sake of simplicity):
Basically, the 3rd command is taking a long time for $SOME_REASON, and we have reason to believe that if it were just retried it would complete in a duration on par with other slots. It would be nice to have a setting that would retry it WITHOUT retrying all of the other slots. Of course, the alternative is to manage the process ourselves: set the command timeout to be very low and retry the entire operation.
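To make the idea concrete, here is a rough sketch of the per-command behavior I have in mind, written against plain CompletableFuture rather than Lettuce internals. Everything named here (SpeculativeRetry, hedged, hedgeDelay) is hypothetical and not part of Lettuce. The backup supplier is assumed to issue the same read-only command over a different connection, since a retry on the same connection would just queue behind the stalled command.

```java
import java.time.Duration;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

// Hypothetical helper, not a Lettuce API: hedges a single read-only command.
final class SpeculativeRetry {

    // Single daemon thread that only fires the hedge timers.
    private static final ScheduledExecutorService TIMER =
            Executors.newSingleThreadScheduledExecutor(r -> {
                Thread t = new Thread(r, "hedge-timer");
                t.setDaemon(true);
                return t;
            });

    /**
     * Starts the primary attempt immediately. If it has not finished after
     * hedgeDelay, starts the backup attempt as well. The first outcome
     * (value or error) from either attempt completes the returned future.
     * Only safe for non-mutating commands, because both attempts may run.
     */
    static <T> CompletableFuture<T> hedged(Supplier<CompletableFuture<T>> primary,
                                           Supplier<CompletableFuture<T>> backup,
                                           Duration hedgeDelay) {
        CompletableFuture<T> result = new CompletableFuture<>();

        // Attempt 1: the command as it would normally be sent.
        primary.get().whenComplete((value, error) -> settle(result, value, error));

        // Attempt 2: only issued if attempt 1 is still pending after the delay.
        ScheduledFuture<?> timer = TIMER.schedule(() -> {
            if (!result.isDone()) {
                backup.get().whenComplete((value, error) -> settle(result, value, error));
            }
        }, hedgeDelay.toMillis(), TimeUnit.MILLISECONDS);

        // Once there is an outcome, a pending hedge is no longer needed.
        result.whenComplete((value, error) -> timer.cancel(false));
        return result;
    }

    // Completes target with the outcome; later calls are no-ops.
    private static <T> void settle(CompletableFuture<T> target, T value, Throwable error) {
        if (error != null) {
            target.completeExceptionally(error);
        } else {
            target.complete(value);
        }
    }
}
```

In a cluster fan-out, each per-slot future would be wrapped this way, so only the straggler gets a second attempt while the other slots are left alone. Note that this sketch does not cancel the losing attempt, and an early error from the primary fails the result without waiting for a hedge.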
If the maintainers see any merit in a patch that would accomplish this, I would be happy to contribute it with some guidance.
FWIW, I've spent some time digging around RedisAdvancedClusterAsyncCommandsImpl and AsyncCommand, and it appears this would be a non-trivial change.