User guide for PyTorch Training #2543

Open
andreyvelich opened this issue Mar 18, 2025 · 3 comments · May be fixed by kubeflow/website#4053

Comments

@andreyvelich
Member

We should create a user guide for PyTorch training.

Initially, it could be a simple guide explaining how to get the available runtimes, how to configure the train() API, and how to use the Kubeflow Trainer client APIs to fetch TrainJob results.
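
As a rough illustration of the three steps such a guide might walk through (list runtimes, call train(), fetch results), here is a minimal sketch. It does not use the real SDK: FakeTrainerClient is a hypothetical in-memory stand-in, and the method names list_runtimes, train, and get_job_logs, their signatures, and the runtime name are assumptions that should be checked against the actual TrainerClient source before writing the guide.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class FakeRuntime:
    """Hypothetical stand-in for a training runtime description."""
    name: str


@dataclass
class FakeTrainerClient:
    """In-memory stand-in, NOT the real TrainerClient.

    Method names and signatures are assumptions made for illustration only.
    """
    # Placeholder runtime name; a real cluster reports its own runtimes.
    runtimes: List[FakeRuntime] = field(
        default_factory=lambda: [FakeRuntime("torch-distributed")]
    )
    _logs: Dict[str, str] = field(default_factory=dict)

    def list_runtimes(self) -> List[FakeRuntime]:
        # Step 1: discover which runtimes are available.
        return list(self.runtimes)

    def train(self, runtime: FakeRuntime, func: Callable[[], str]) -> str:
        # Step 2: submit a training function. Here it simply runs locally
        # and its return value is stored as the job's "logs".
        job_name = f"job-{len(self._logs) + 1}"
        self._logs[job_name] = func()
        return job_name

    def get_job_logs(self, name: str) -> str:
        # Step 3: fetch the results of a submitted job by name.
        return self._logs[name]


def train_fn() -> str:
    # Placeholder for a real PyTorch training loop.
    return "epoch=1 loss=0.5"


def run_guide_flow(client: FakeTrainerClient):
    """Run the three steps in order and return (job name, logs)."""
    runtime = client.list_runtimes()[0]
    job_name = client.train(runtime=runtime, func=train_fn)
    return job_name, client.get_job_logs(job_name)


job_name, logs = run_guide_flow(FakeTrainerClient())
```

In the real guide, the stand-in would be replaced by the actual client and train_fn by a PyTorch training function; the point here is only the order of the calls.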

Feel free to propose more guides that we should create for Kubeflow Trainer V2 @varodrig @hbelmiro @franciscojavierarceo!

/good-first-issue
/area docs


@andreyvelich:
This request has been marked as suitable for new contributors.

Please ensure the request meets the requirements listed here.

If this request no longer meets these requirements, the label can be removed
by commenting with the /remove-good-first-issue command.


@izuku-sds

Hello @andreyvelich, I would like to work on this issue. However, I'm not fully familiar with these concepts. Could you point me to resources I can refer to?

@andreyvelich
Member Author

Thank you for your interest, @izuku-sds. Sure!
Please start by exploring our TrainerClient APIs to understand which APIs users can call while developing PyTorch models: https://github.com/kubeflow/trainer/blob/master/sdk/kubeflow/trainer/api/trainer_client.py
/assign @izuku-sds
