[Bug] Multi-GPU OOM: Missing device isolation in inference_treebench.py #10

@wmz422

Description

I encountered an out-of-memory (OOM) error when running inference_treebench.py on a multi-GPU machine.
The script spawns worker processes via multiprocessing and loads the model with device_map="auto", but it never assigns a specific GPU to each worker. As a result, every spawned process tries to load the model onto the same GPU (usually GPU 0), which causes the OOM.
