Hi. I love OT, but I'd like to run it in the cloud for much larger datasets. I know there's a CLI, but I'm having a hell of a time getting it working. I'm trying to get it running on RunPod with Conda and Python 3.10.9. I installed all the requirements, copied all the files over from an earlier UI run so I could continue it, and edited the JSON to repoint the paths to the new RunPod /workspace structure. I staged the files in the same directory structure and tried to run create_train_files, but got a "Module not found" error. I did get it to run once, but it complained about SD 1.5 not being present, even though my JSON file was pointing directly to a local SDXL base model.
Has anyone gotten this working satisfactorily on a cloud GPU provider? If you have, could you share the special sauce?
Thanks!
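The path-repointing step described above can be scripted instead of edited by hand, which makes it easier to spot a path that was missed. The following is only a minimal sketch: the key names (`model_path`, `concepts`, `path`) and the prefixes `C:/OT` and `/workspace/OneTrainer` are made up for illustration and are not OneTrainer's actual config schema. It walks any JSON structure and rewrites string values that start with the old prefix.

```python
import json


def repoint_paths(node, old_prefix, new_prefix):
    """Recursively rewrite string values that start with old_prefix.

    Backslashes are normalized to forward slashes before matching, so
    Windows-style paths saved by a desktop run are matched as well.
    Values that do not start with old_prefix are returned unchanged.
    """
    if isinstance(node, dict):
        return {k: repoint_paths(v, old_prefix, new_prefix) for k, v in node.items()}
    if isinstance(node, list):
        return [repoint_paths(v, old_prefix, new_prefix) for v in node]
    if isinstance(node, str):
        normalized = node.replace("\\", "/")
        if normalized.startswith(old_prefix):
            return new_prefix + normalized[len(old_prefix):]
    return node


def collect_strings_with_prefix(node, prefix):
    """Return every string value in the structure that starts with prefix."""
    if isinstance(node, dict):
        node = list(node.values())
    if isinstance(node, list):
        found = []
        for item in node:
            found.extend(collect_strings_with_prefix(item, prefix))
        return found
    if isinstance(node, str) and node.startswith(prefix):
        return [node]
    return []


# Hypothetical config; the keys are illustrative, not a real schema.
example = {
    "model_path": "C:\\OT\\models\\sdxl_base.safetensors",
    "concepts": [{"path": "C:/OT/data/set1"}],
    "epochs": 10,
}
updated = repoint_paths(example, "C:/OT", "/workspace/OneTrainer")
print(json.dumps(updated, indent=2))
```

For a real file, load it with `json.load`, pass the result through `repoint_paths`, and write it back with `json.dump`. Calling `collect_strings_with_prefix(updated, "C:/")` afterwards should return an empty list; anything it does return is a path that still points at the old machine.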
What would you like to see as a solution?
A shiny new update to the CLI usage.
Have you considered alternatives? List them here.
I guess I could hack together an X server and push it through WebRTC, but that's a bit of work.
Or maybe X11 forwarding via an SSH tunnel, but I'm guessing that won't work with RunPod, since its pods are more like pseudo Docker containers.
Look at ./run-cmd.sh and its usage instructions to see how to launch CLI training or any of the other CLI tools. :)
There's also someone working on RunPod cloud support within OneTrainer, which will make RunPod itself much easier to use. I described that in another ticket, in a comment chain starting here: