
🐛[BUG]: Cannot detect all the GPUs #752

Closed
david5010 opened this issue Jan 8, 2025 · 3 comments
Labels
? - Needs Triage (Need team to review and classify), bug (Something isn't working)

Comments


david5010 commented Jan 8, 2025

Version

0.6.0

On which installation method(s) does this occur?

No response

Describe the issue

I'm currently using Earth2Mip on Docker with the Modulus image. On my PC I have 2 GPUs, and PyTorch can detect them without any issues. I want to speed up the ensemble forecast by leveraging both GPUs rather than just one. However, when I checked, Modulus isn't detecting both GPUs, just the first one. Are there any fixes I can apply so that it will detect both?
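
For reference, a quick check with the standard PyTorch API (run inside the same container) reports both devices:

    import torch

    print(torch.cuda.device_count())   # prints 2 on this machine
    print(torch.cuda.get_device_name(0))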

    # Import paths below are assumed from the earth2mip / Modulus layout
    # current at the time of this issue.
    import logging

    import torch.distributed
    from modulus.distributed import DistributedManager
    from earth2mip.networks import get_model
    from earth2mip.inference_ensemble import get_initializer, run_inference

    # Initialize the distributed environment; rank and world size are read
    # from the environment, and this process is pinned to its device.
    DistributedManager.initialize()
    device = DistributedManager().device
    group = torch.distributed.group.WORLD

    # `config` is the ensemble configuration parsed earlier (not shown).
    logging.info(f"Earth-2 MIP config loaded {config}")
    logging.info(f"Loading model onto device {device}")
    model = get_model(config.weather_model, device=device)
    logging.info("Constructing initializer data source")
    perturb = get_initializer(
        model,
        config,
    )
    logging.info("Running inference")
    run_inference(model, config, perturb, group)
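
As a sanity check, DistributedManager exposes rank, world_size, local_rank, and device properties, so the resolved process layout can be logged with something like:

    dm = DistributedManager()
    logging.info(
        f"rank={dm.rank} world_size={dm.world_size} "
        f"local_rank={dm.local_rank} device={dm.device}"
    )

If only one process was launched, this will report world_size=1.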

Minimum reproducible example

No response

Relevant log output

No response

Environment details

No response

david5010 added the ? - Needs Triage and bug labels on Jan 8, 2025
coreyjadams (Collaborator) commented

Hi @david5010 ,

How are you launching the code? Can you share the launch command and any errors you see? Also, can you share what you are expecting to see when you call run_inference?

david5010 (Author) commented

Hi,

I realized that it was a mistake on my end. I was expecting to see both GPUs being utilized, and it turns out I simply had to use torchrun instead of python to make it work.
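
For anyone hitting the same thing: DistributedManager initializes from the environment variables (RANK, WORLD_SIZE, LOCAL_RANK) that torchrun sets, so launching with plain python only ever creates a single process. Something along these lines, with an illustrative script name, spawns one process per GPU:

    # one process per GPU on a single machine with 2 GPUs
    torchrun --nproc_per_node=2 run_ensemble.py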

coreyjadams (Collaborator) commented

Glad you got it working!
