Clusterctl upgrade flake #11610
Labels
area/e2e-testing, kind/flake, needs-priority, needs-triage
Which jobs are flaking?
periodic-cluster-api-e2e-release-1-9
periodic-cluster-api-e2e-mink8s-release-1-8
periodic-cluster-api-e2e-latestk8s-main
Which tests are flaking?
When testing clusterctl upgrades (v0.4=>v1.6=>current) Should create a management cluster and then upgrade all the providers
When testing clusterctl upgrades (v0.3=>v1.5=>current) Should create a management cluster and then upgrade all the providers
Since when has it been flaking?
Looks like it has been flaking for a while... failures from:
and similar patterns can be seen in the triage timeline going back to August/September: https://storage.googleapis.com/k8s-triage/index.html?date=2024-09-04&text=Timed%20out%20waiting%20for%20Cluster&job=.*periodic-cluster-api-e2e.*&test=.*clusterctl%20upgrades
Testgrid link
https://prow.k8s.io/view/gs/kubernetes-ci-logs/logs/periodic-cluster-api-e2e-latestk8s-main/1870177294711001088
Reason for failure (if possible)
It looks like the Docker infrastructure controller fails to start the workload cluster's load balancer container because of a host port conflict when provisioning with clusterctl v0.3 and v0.4, which leaves the DockerCluster stuck:
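Not verified against the actual CAPD code paths, but for illustration here is a minimal Go sketch of the kind of pre-flight check that avoids this class of failure: asking the kernel for a free host port before publishing the load balancer container on it. All names here (`findFreeHostPort`, the printed message) are hypothetical and not CAPD's real API; the check is also inherently racy if another process grabs the port before the container binds it, which is roughly the conflict described above.

```go
package main

import (
	"fmt"
	"net"
)

// findFreeHostPort (hypothetical helper) listens on ":0" so the kernel
// assigns an unused ephemeral port, records that port, and releases the
// listener so the caller can hand the port to the container runtime.
func findFreeHostPort() (int, error) {
	l, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return 0, err
	}
	defer l.Close()
	return l.Addr().(*net.TCPAddr).Port, nil
}

func main() {
	port, err := findFreeHostPort()
	if err != nil {
		fmt.Println("no free port:", err)
		return
	}
	// A real controller would pass this as the host side of the load
	// balancer container's published port mapping.
	fmt.Printf("would publish the load balancer on host port %d\n", port)
}
```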
Anything else we need to know?
No response
Label(s) to be applied
/kind flake
One or more /area labels. See https://github.com/kubernetes-sigs/cluster-api/labels?q=area for the list of labels.