Use cncf-hosted gha runners #2538

Open
wants to merge 1 commit into
base: master

Conversation

@jeefy jeefy commented Mar 17, 2025

Description

CNCF hosts ephemeral GitHub runners in Oracle that we want projects to use instead of the GitHub-hosted ones, which now incur a cost to use. This PR is currently a WIP to work through any tests that break or dependencies that may be missing. <3

Please direct any questions to me, @krook, and @RobertKielty.

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign gaocegege for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@jeefy jeefy force-pushed the cncf-gha-runners branch from 4a8708a to d5fb853 Compare March 17, 2025 17:33
@jeefy jeefy force-pushed the cncf-gha-runners branch from d5fb853 to 2232e4e Compare March 17, 2025 17:45
@google-oss-prow google-oss-prow bot added size/M and removed size/XS labels Mar 17, 2025
@jeefy jeefy changed the title WIP: Use cncf-hosted gha runners Use cncf-hosted gha runners Mar 17, 2025
Signed-off-by: Jeffrey Sica <[email protected]>
@jeefy jeefy force-pushed the cncf-gha-runners branch from 2232e4e to 88f3098 Compare March 17, 2025 17:54
Member

@andreyvelich andreyvelich left a comment

It is great that we have access to the Oracle runners 🎉
Thank you for this @jeefy!
Should we update this issue and update other repos (e.g. mpi-operator): kubeflow/community#829 ?
cc @kubeflow/wg-training-leads @kubeflow/kubeflow-steering-committee

@jeefy
Author

jeefy commented Mar 17, 2025 via email

@tenzen-y
Member

If we move all CI jobs to the CNCF-hosted runners, we need to create DinD container images so that we can run a Kind cluster during CI.

@juliusvonkohout
Member

juliusvonkohout commented Mar 18, 2025

> If we move all CI jobs to the CNCF-hosted runners, we need to create DinD container images so that we can run a Kind cluster during CI.

If I understand this correctly, this only applies to the larger, more expensive GHA runners, not the default ones provided by GitHub. So we only need to migrate workflows that benefit from larger runners or VMs.

@tenzen-y
Member

> > If we move all CI jobs to the CNCF-hosted runners, we need to create DinD container images so that we can run a Kind cluster during CI.
>
> If I understand this correctly, this only applies to the larger, more expensive GHA runners, not the default ones provided by GitHub. So we only need to migrate workflows that benefit from larger runners or VMs.

Yes, your understanding is correct. I mostly meant the Trainer E2E tests.

@jeefy
Author

jeefy commented Mar 18, 2025

> If we move all CI jobs to the CNCF-hosted runners, we need to create DinD container images so that we can run a Kind cluster during CI.

DinD is already baked into the current setup. You can do Docker builds (and some other jobs already do).

We still need to debug why your e2e/Kind cluster didn't spin up, though.

@tenzen-y
Member

> > If we move all CI jobs to the CNCF-hosted runners, we need to create DinD container images so that we can run a Kind cluster during CI.
>
> DinD is already baked into the current setup. You can do Docker builds (and some other jobs already do).
>
> We still need to debug why your e2e/Kind cluster didn't spin up, though.

Oh, I didn't know that. Thank you for letting us know.

4 participants