How to do inference using multiple GPUs for Styleformer #10
Leverage the `inference_on` parameter; I updated it to make it more intuitive now for multi-GPU usage. `-1` is reserved for CPU, and `0` through `998` are reserved for GPUs. The following snippet will get you the max number of visible CUDA devices you have:

```python
import torch

num_of_gpus = torch.cuda.device_count()
print(num_of_gpus)
```

You just have to pass a value in `range(num_of_gpus)`, i.e. `0` to `<your_max_devices>`, in `inference_on`. Behind the scenes I will be using the CUDA devices as `cuda:0`, `cuda:1`, etc., up to `num_of_gpus - 1`. Just write a function that wraps Styleformer inference with the device index as one of the params and invoke it using simple Python multiprocessing. The number of processes can be equal to the number of devices. Each process will run Styleformer inference with its respective device index, say P0 will run on `cuda:0`, P1 will run on `cuda:1`, and so on. Internally handle how you want to store the inference results.
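A minimal sketch of that pattern, using the `Styleformer(style=0)` constructor and the `transfer(..., inference_on=...)` call from the README; the `run_on_device` wrapper, the chunking scheme, and the sample sentences are illustrative, and the `"spawn"` start method is chosen so each worker initializes CUDA cleanly:

```python
import torch
from multiprocessing import get_context
from styleformer import Styleformer

def run_on_device(args):
    """Run Styleformer inference for one chunk of sentences on one GPU."""
    device_idx, sentences = args
    sf = Styleformer(style=0)  # style=0: casual-to-formal, per the README
    # inference_on=device_idx is mapped to cuda:<device_idx> internally
    return [sf.transfer(s, inference_on=device_idx) for s in sentences]

if __name__ == "__main__":
    sentences = ["he is a dope guy", "going there tmrw"]  # stand-in for your data
    num_of_gpus = torch.cuda.device_count()
    # One chunk per visible GPU; striding keeps chunk sizes balanced
    chunks = [sentences[i::num_of_gpus] for i in range(num_of_gpus)]
    # "spawn" so every worker process gets a fresh CUDA context
    with get_context("spawn").Pool(processes=num_of_gpus) as pool:
        results = pool.map(run_on_device, list(enumerate(chunks)))
    print(results)
```

Each pool worker loads its own copy of the model, so P0 drives `cuda:0`, P1 drives `cuda:1`, and so on; `pool.map` collects the per-device results back in the parent process.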
@pratikchhapolika It sounds like you'll need to fire up a separate process for each GPU and pass in a different device index for each one.

@PrithivirajDamodaran What I would like to know is how one can batchify Styleformer inference tasks to make efficient use of GPUs that have 48GB or 80GB each.
@PrithivirajDamodaran Please confirm: will distributed training work here? Like this:
Yes, it can be batched. Will add that patch now.
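Until that patch lands, here is a hedged sketch of application-level batching that calls the underlying seq2seq model directly with padded batches. The checkpoint name `prithivida/informal_to_formal_styletransfer` and the `transfer Casual to Formal: ` task prefix are assumptions about what Styleformer wraps for `style=0`, not a confirmed Styleformer API:

```python
import torch
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

# Assumed checkpoint for Styleformer's style=0 (casual-to-formal) direction
MODEL_NAME = "prithivida/informal_to_formal_styletransfer"

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_NAME).to("cuda:0").eval()

def transfer_batched(sentences, batch_size=64):
    """Rewrite sentences in padded batches instead of one call per sentence."""
    results = []
    for i in range(0, len(sentences), batch_size):
        # Assumed task prefix, mirroring the single-sentence path
        batch = ["transfer Casual to Formal: " + s for s in sentences[i:i + batch_size]]
        enc = tokenizer(batch, return_tensors="pt", padding=True, truncation=True)
        enc = {k: v.to(model.device) for k, v in enc.items()}
        with torch.no_grad():
            out = model.generate(**enc, max_length=64, num_beams=5)
        results.extend(tokenizer.batch_decode(out, skip_special_tokens=True))
    return results
```

Larger `batch_size` values are what actually fill a 48GB or 80GB card; combining this with the one-process-per-GPU pattern above would use all devices.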
@PrithivirajDamodaran How's the batch patch coming along?
I am using this model to do inference on 1 million data points using A100 GPUs, with 4 GPUs. I am launching an inference.py script using Google's Vertex AI container. How can I make the inference code utilise all 4 GPUs, so that inference is super fast?

Here is the code I use in inference.py: