-
Notifications
You must be signed in to change notification settings - Fork 3.9k
roachtest: adding defensive code in ceph/reef test #148756
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With some minor tweaks.
Reviewed 1 of 1 files at r1, all commit messages.
Reviewable status:complete! 1 of 0 LGTMs obtained (waiting on @jeffswenson)
pkg/cmd/roachtest/tests/s3_microceph.go
line 178 at r1 (raw file):
cmd := `sudo radosgw-admin user list` var err error for i := 0; i < 10; i++ {
Do we have a standard retry loop that we use with other tests? There must be some prior art here. If not, this is good enough.
pkg/cmd/roachtest/tests/s3_microceph.go
line 180 at r1 (raw file):
for i := 0; i < 10; i++ { // Sleep for few seconds, then try the command. time.Sleep(2 * time.Second)
Why not put the sleep after the command fails?
It might be worth writing the err to a log as an info with a bit of context.
We have seen sporadic failures in the ceph tests, due to failures in creating users in the ceph object gateway. To address this we are adding code to check that the gateway is up by submitting a read only request, before attempting to add the user. Epic: none Fixes: cockroachdb#148731 Release note: None
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated based on comments.
Reviewable status:
complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @BramGruneir and @jeffswenson)
pkg/cmd/roachtest/tests/s3_microceph.go
line 178 at r1 (raw file):
Previously, BramGruneir (Bram Gruneir) wrote…
Do we have a standard retry loop that we use with other tests? There must be some prior art here. If not, this is good enough.
Done
pkg/cmd/roachtest/tests/s3_microceph.go
line 180 at r1 (raw file):
Previously, BramGruneir (Bram Gruneir) wrote…
Why not put the sleep after the command fails?
It might be worth writing the err to a log as an info with a bit of context.
Reusing RunE
retry loop.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 1 of 1 files at r2, all commit messages.
Reviewable status:complete! 1 of 0 LGTMs obtained (waiting on @jeffswenson)
TFTR bors r+ |
Based on the specified backports for this PR, I applied new labels to the following linked issue(s). Please adjust the labels as needed to match the branches actually affected by the issue(s), including adding any known older branches. Issue #148731: branch-release-24.3, branch-release-25.1, branch-release-25.2. 🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf. |
We have seen sporadic failures in the ceph tests, due to failures in creating users in the ceph object gateway.
To address this we are adding code to check that the gateway is up by submitting a read only request, before attempting to add the user.
Epic: none
Fixes: #148731
Release note: None