Improve robustness during outages #127

Open
jooola opened this issue Nov 14, 2024 · 1 comment


jooola commented Nov 14, 2024

Yesterday, the Hetzner Cloud API had an outage, and it appears that the docker machine driver did not handle it well.

[Image: docker-machine-during-outage]

You can see that from 2024-11-13 17:00:00 to 2024-11-14 08:00:00, the number of requests to /server_types, /images and /locations is unexpectedly high. The number of requests for single actions was also really high.

This leads to rate limiting while waiting for servers to be created.

I see a few possible improvements:

@JonasProgrammer
Owner

Free stress testing, I don't see the issue.

Bad jokes aside, sorry this caused you headaches. I'll look into getting the exponential back-off implemented soon. Regarding error handling in general, I am somewhat torn as to what the best approach is. We do have explicit retry with a set timeout, which was implemented in response to a feature request. The default behaviour is to fail fast, as it always has been, but that could be changed in a major version bump. When using the CLI, failing fast is what I would expect, but I do see the problem with applications that talk to docker-machine over RPC, such as Rancher, producing a request storm in fail-fast mode.
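
For illustration, here is a minimal sketch of what a capped exponential back-off with jitter could look like in Go. It is not the driver's code: `backoffDelay`, `jitter`, `retry`, and the `baseDelay`/`maxDelay` values are all hypothetical names and numbers chosen for the example, and the failing operation is simulated rather than a real API call.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"math/rand"
	"time"
)

// Hypothetical defaults, chosen for this example only.
const (
	baseDelay = 500 * time.Millisecond
	maxDelay  = 30 * time.Second
)

// backoffDelay returns base * 2^attempt (attempt is 0-based), capped at limit.
func backoffDelay(attempt int, base, limit time.Duration) time.Duration {
	d := base
	for i := 0; i < attempt; i++ {
		if d >= limit/2 {
			return limit // doubling again would reach or exceed the cap
		}
		d *= 2
	}
	if d > limit {
		return limit
	}
	return d
}

// jitter picks a random wait in [0, d] so that many clients do not retry in lockstep.
func jitter(d time.Duration) time.Duration {
	return time.Duration(rand.Int63n(int64(d) + 1))
}

// retry calls op until it succeeds, the attempts run out, or ctx is cancelled.
func retry(ctx context.Context, maxAttempts int, op func() error) error {
	if maxAttempts < 1 {
		maxAttempts = 1
	}
	var err error
	for attempt := 0; attempt < maxAttempts; attempt++ {
		if err = op(); err == nil {
			return nil
		}
		if attempt == maxAttempts-1 {
			break // no point in sleeping after the final attempt
		}
		select {
		case <-time.After(jitter(backoffDelay(attempt, baseDelay, maxDelay))):
		case <-ctx.Done():
			return ctx.Err()
		}
	}
	return fmt.Errorf("giving up after %d attempts: %w", maxAttempts, err)
}

func main() {
	for attempt := 0; attempt < 8; attempt++ {
		fmt.Printf("attempt %d: wait up to %s\n", attempt, backoffDelay(attempt, baseDelay, maxDelay))
	}

	// Simulated operation: fails once, then succeeds.
	calls := 0
	err := retry(context.Background(), 3, func() error {
		calls++
		if calls < 2 {
			return errors.New("simulated API error")
		}
		return nil
	})
	fmt.Println("calls:", calls, "err:", err)
}
```

The cap keeps a long outage from producing unbounded waits, and the jitter spreads retries from many machines over time instead of having them hit the API at the same moment.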
As for the caching, I do get the point about them being stable. However, I cannot really be sure what environment the driver is running in. Granted, vanilla docker-machine would be useless without a writable home directory. But given its RPC nature, it could be run under all kinds of restrictions, so long as one makes sure it can access the provided SSH key files.
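
One way to reconcile caching with an unknown environment is to treat the disk as optional. The sketch below, again in Go and purely hypothetical (the `cache` type, the `defaultTTL` value, and the directory name are assumptions, not anything the driver does today), keeps entries in memory and only additionally persists them when a cache directory can be created. Write failures are ignored, so a read-only filesystem degrades to per-process caching instead of an error.

```go
package main

import (
	"encoding/json"
	"fmt"
	"os"
	"path/filepath"
	"time"
)

// Hypothetical TTL for rarely changing data such as locations or server types.
const defaultTTL = time.Hour

type entry struct {
	StoredAt time.Time `json:"stored_at"`
	Data     []byte    `json:"data"`
}

type cache struct {
	dir string // empty means memory only
	ttl time.Duration
	mem map[string]entry
}

// newCache uses dir for persistence only if it can be created.
func newCache(dir string, ttl time.Duration) *cache {
	c := &cache{ttl: ttl, mem: map[string]entry{}}
	if dir != "" && os.MkdirAll(dir, 0o700) == nil {
		c.dir = dir
	}
	return c
}

func (c *cache) load(key string) (entry, bool) {
	if e, ok := c.mem[key]; ok {
		return e, true
	}
	if c.dir == "" {
		return entry{}, false
	}
	raw, err := os.ReadFile(filepath.Join(c.dir, key+".json"))
	if err != nil {
		return entry{}, false
	}
	var e entry
	if json.Unmarshal(raw, &e) != nil {
		return entry{}, false
	}
	return e, true
}

func (c *cache) store(key string, e entry) {
	c.mem[key] = e
	if c.dir == "" {
		return
	}
	if raw, err := json.Marshal(e); err == nil {
		// Best effort: a failed write just means no persistence.
		_ = os.WriteFile(filepath.Join(c.dir, key+".json"), raw, 0o600)
	}
}

// get returns a fresh cached value or calls fetch and caches the result.
func (c *cache) get(key string, fetch func() ([]byte, error)) ([]byte, error) {
	if e, ok := c.load(key); ok && time.Since(e.StoredAt) < c.ttl {
		return e.Data, nil
	}
	data, err := fetch()
	if err != nil {
		return nil, err
	}
	c.store(key, entry{StoredAt: time.Now(), Data: data})
	return data, nil
}

func main() {
	dir := ""
	if base, err := os.UserCacheDir(); err == nil {
		dir = filepath.Join(base, "example-driver-cache") // hypothetical directory name
	}
	c := newCache(dir, defaultTTL)

	// Stand-in for a real API request.
	fetch := func() ([]byte, error) { return []byte(`["fsn1","nbg1"]`), nil }
	data, err := c.get("locations", fetch)
	fmt.Println(string(data), err)
}
```

Whether persisting anything to disk is acceptable at all remains the open question from the comment above; the in-memory path alone would already avoid repeated lookups within a single driver process.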
