chore: Wait for instance termination before deleting nodeclaim #1195
Hi @jigisha620. Thanks for your PR. I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
Pull Request Test Coverage Report for Build 9487162594 (Coveralls)
Really nice work! I didn't look at the testing, but overall the core code looks good. This is also going to be amazing for measuring how long instance terminations take, if we can measure the time from the status transition to the delete call. Does it make sense to add a metric for this before actually removing the finalizer?
90c7dc3 to 67b8124
Nice work!
8c83b5c to 5925061
5925061 to 3d1d85b
3d1d85b to b60d4eb
b60d4eb to 2d26023
2d26023 to 6ac7eaa
67adb7e to 9ce1415
b92e786 to e997a44
For all of the tests, I would tidy up the test descriptions. I should be able to read a test's description and immediately reason about what it does and what it validates, without having to read through the details of the test itself.
1f6cd79 to 54c39e7
54c39e7 to d28a984
/lgtm
/approve
/hold Waiting for the E2E tests in the downstream repo in the AWS Provider to pass with these new changes
/unhold Tests passed and are running in a reasonable time. This should be GTG
/unhold
84087a7 to 2e5f91e
/lgtm
/approve
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: jigisha620, jonathan-innis.
Fixes #N/A
Description
Finalizers on the nodeClaim and node should not be removed until the underlying instance is deleted, to avoid leaking any resources. The current approach relies on a retryable error being emitted by cloudProvider.Delete() and continues reconciliation until this error is received. However, that is not an ideal approach: when it was tested with the AWS provider, we found that some instances could take too long to delete. Hence, this PR instead adds a "terminating" status condition on the nodeClaim. If the condition exists, we call cloudProvider.Get() to check whether the instance is terminated. If it is terminated, we can remove the finalizer from the nodeClaim.
How was this change tested?
Tested on my local cluster and ran unit tests
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.