-
Notifications
You must be signed in to change notification settings - Fork 66
Fix flaky VRG test by waiting for deletion to complete #2295
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
…#2294) Signed-off-by: nadavleva <[email protected]>
076b5f8 to
cc46c3a
Compare
nirs
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@nadavleva thanks for contributing!
| vrgDelete := func() { | ||
| Expect(k8sClient.Delete(ctx, vrg)).To(Succeed()) | ||
| Eventually(vrgGet).Should(MatchError(k8serrors.NewNotFound( | ||
| Eventually(vrgGet, time.Second*10, time.Millisecond*100).Should(MatchError(k8serrors.NewNotFound( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This may work but it is not clear based on the the function signature:
Eventually(actualOrCtx any, args ...any) AsyncAssertionWe should use:
Eventually(vrgGet).WithTimeout(10*time.Second).Should(MatchError(k8serrors.NewNotFound(Also it will help to add a comment explaining why we need larger timeout for this test.
I would not bother with the polling interval to keep the code simpler.
Can you share tests results with this timeout, proving that the flakiness is caused by short timeout?
A good example would be to run this test 100 times with this timeout and measure how much time we waited (min, max, avg).
We have lot of falky tests, I wonder this delete timeout within 1 second is not causing other tests to fail. We can increate the default timeout to test this:
https://pkg.go.dev/github.com/onsi/gomega#SetDefaultEventuallyTimeout
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will run some tests to verify the behavior and have more info on the deletion time interval.
I encountered additional tests that are flaky due to timeout, e.g. #2235
|
@nadavleva I modified the title and commit message to be consistent with our standards:
|
|
First run was good with this change but this this is a flaky test one run does not mean much. I will trigger more runs in the next days. The best way to validated this is to run only this test locally with high count number and and better logging showing how much time we waited for deletion. |
|
Hey @nirs |
This is fine, we cannot fix all flaky tests at once. The good thing is that the tests you change did not fail, but of course 2 runs is not enough for testing flakes. |
Signed-off-by: nadavleva <[email protected]>
This PR updates the VolumeReplicationGroupRecipe Controller test to wait for VRG deletion to complete, addressing flakiness observed in the test.
Fixes #2294