
The benchmarks does not work on CPU because of AMP #9119


Open
haifeng-jin opened this issue May 8, 2025 · 5 comments · May be fixed by #9218


haifeng-jin commented May 8, 2025

🐛 Bug

The benchmarks in pytorch/xla do not work on CPU because they are set to use AMP by default.

To Reproduce

Steps to reproduce the behavior:

  1. Follow the instructions in the README.md to run the benchmarks.
  2. Or run this command directly:
python xla/benchmarks/experiment_runner.py --dynamo=openxla --xla=PJRT --test=eval --test=train --suite-name=torchbench --accelerator=cpu --output-dirname=/tmp/output --repeat=1 --print-subprocess --no-resume --dump-pytorch-xla-metrics

Expected behavior

Add a new --amp argument to the CLI so users can configure it (i.e., disable AMP) when running on CPU.
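
A minimal sketch of what such a flag could look like, assuming experiment_runner.py builds its CLI with argparse; the flag name, default, and wiring below are illustrative, not the actual patch:

import argparse

# Hypothetical sketch of an --amp / --no-amp switch; the real runner's
# parser and option names may differ.
parser = argparse.ArgumentParser()
parser.add_argument(
    "--amp",
    action=argparse.BooleanOptionalAction,  # generates both --amp and --no-amp
    default=True,  # keep AMP on by default so GPU/TPU runs stay comparable
    help="Run the models under AMP autocast; pass --no-amp on CPU.",
)

# Example: a CPU run would disable AMP explicitly.
args = parser.parse_args(["--no-amp"])
assert args.amp is False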

Environment

  • Reproducible on XLA backend [CPU/TPU/CUDA]: CPU
  • torch_xla version: master or 2.7
@haifeng-jin self-assigned this May 8, 2025
@haifeng-jin added the bug (Something isn't working) and benchmarking labels May 8, 2025
@haifeng-jin (Collaborator, Author) commented:

Need input from @zpcore @ysiraichi before creating a pull request.

@ysiraichi (Collaborator) commented:

The idea of hard-coding AMP was to be directly comparable with the PyTorch HUD. The question is: why doesn't it work? Could you post the error you are getting?

@haifeng-jin (Collaborator, Author) commented:

Just got back from my OOO.
Let me run this again and paste the results.

@haifeng-jin (Collaborator, Author) commented:

Here is the stack trace of the error:

Traceback (most recent call last):
  File "/workspaces/torch/pytorch/xla/benchmarks/experiment_runner.py", line 1060, in <module>
    main()
  File "/workspaces/torch/pytorch/xla/benchmarks/experiment_runner.py", line 1056, in main
    runner.run()
  File "/workspaces/torch/pytorch/xla/benchmarks/experiment_runner.py", line 67, in run
    self.run_single_config()
  File "/workspaces/torch/pytorch/xla/benchmarks/experiment_runner.py", line 293, in run_single_config
    model = self.model_loader.load_model(model_config, experiment)
  File "/workspaces/torch/pytorch/xla/benchmarks/benchmark_model.py", line 263, in load_model
    benchmark_model.set_up()
  File "/workspaces/torch/pytorch/xla/benchmarks/torchbench_model.py", line 263, in set_up
    self.autocast, self.autocast_kwargs = self._get_autocast_with_kwargs()
  File "/workspaces/torch/pytorch/xla/benchmarks/torchbench_model.py", line 435, in _get_autocast_with_kwargs
    raise RuntimeError(f"Tried to run {name} with AMP on {accelerator}. "
RuntimeError: Tried to run BERT_pytorch with AMP on cpu. However, AMP is only supported on cuda and tpu.
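
For what it's worth, stock PyTorch does support autocast on CPU, just with bfloat16 rather than float16, so an alternative to hard-failing would be to map CPU to a bfloat16 autocast context. Here is a hedged sketch of that idea; the function name is taken from the stack trace, but the body is invented for illustration and is not the real torchbench_model.py code:

import torch

def _get_autocast_with_kwargs(accelerator):
    # Sketch only: the real implementation in torchbench_model.py differs,
    # and the tpu branch is omitted here.
    if accelerator == "cuda":
        return torch.autocast, {"device_type": "cuda", "dtype": torch.float16}
    if accelerator == "cpu":
        # CPU autocast exists in stock PyTorch, but only with bfloat16.
        return torch.autocast, {"device_type": "cpu", "dtype": torch.bfloat16}
    raise RuntimeError(f"AMP is not supported on {accelerator}.")

# Usage: run the model's forward pass inside the returned context.
autocast, kwargs = _get_autocast_with_kwargs("cpu")
with autocast(**kwargs):
    out = torch.nn.Linear(4, 4)(torch.randn(2, 4))
    # Linear under CPU bf16 autocast produces bfloat16 activations.
    assert out.dtype == torch.bfloat16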

@haifeng-jin linked a pull request (#9218) May 20, 2025 that will close this issue
@ysiraichi (Collaborator) commented:

Thank you for posting the error. One question, though: why do you want to run it on CPU?
