[distributed] AttributeError: module 'torch.xpu' has no attribute '_sleep' #1727

@PenghuiCheng

Description

🚀 The feature, motivation and pitch

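The FSDP test below fails on XPU with AttributeError: module 'torch.xpu' has no attribute '_sleep'. The test calls torch.get_device_module(device_type)._sleep(...), so torch.xpu needs to provide _sleep for the XPU device.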

Steps to reproduce:
pytest -vs _composable/fsdp/test_fully_shard_training.py -k test_non_root_forward_backward

Error message:
Traceback (most recent call last):
  File "/home/sdp/penghuic/pytorch/torch/testing/_internal/common_distributed.py", line 643, in wrapper
    self._join_processes(fn)
  File "/home/sdp/penghuic/pytorch/torch/testing/_internal/common_distributed.py", line 907, in _join_processes
    self._check_return_codes(fn, elapsed_time)
  File "/home/sdp/penghuic/pytorch/torch/testing/_internal/common_distributed.py", line 947, in _check_return_codes
    raise RuntimeError(error)
RuntimeError: Process 3 exited with error code 10 and exception:
Traceback (most recent call last):
  File "/home/sdp/penghuic/pytorch/torch/testing/_internal/common_distributed.py", line 791, in run_test
    getattr(self, test_name)()
  File "/home/sdp/penghuic/pytorch/torch/testing/_internal/common_distributed.py", line 645, in wrapper
    fn()
  File "/home/sdp/penghuic/pytorch/torch/testing/_internal/common_utils.py", line 3148, in wrapper
    method(*args, **kwargs)
  File "/home/sdp/penghuic/pytorch/torch/testing/_internal/common_distributed.py", line 205, in wrapper
    return func(*args, **kwargs)
  File "/home/sdp/penghuic/pytorch/test/distributed/_composable/fsdp/test_fully_shard_training.py", line 521, in test_non_root_forward_backward
    torch.get_device_module(device_type)._sleep(int(100 * get_cycles_per_ms()))
AttributeError: module 'torch.xpu' has no attribute '_sleep'

To execute this test, run the following from the base repo dir:
python test/distributed/_composable/fsdp/test_fully_shard_training.py TestFullyShard1DTrainingCore.test_non_root_forward_backward

This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
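
Until _sleep is available on XPU, a test could probe the device module for the attribute and skip instead of crashing. The following is only a sketch of that idea, not the actual fix: require_device_sleep is a hypothetical helper, and fake_cuda / fake_xpu are stand-in objects used here so the snippet runs without PyTorch installed. In a real test, the argument would presumably be the module returned by torch.get_device_module(device_type).

```python
import types
import unittest


def require_device_sleep(device_module):
    """Return the backend's private `_sleep` callable, or skip the test.

    Hypothetical helper: it only checks for the attribute and does not
    emulate a device-side sleep on backends that lack one.
    """
    sleep_fn = getattr(device_module, "_sleep", None)
    if sleep_fn is None:
        name = getattr(device_module, "__name__", repr(device_module))
        raise unittest.SkipTest(f"{name} does not provide _sleep")
    return sleep_fn


# Stand-ins for real backend modules (assumptions, for illustration only).
fake_cuda = types.SimpleNamespace(__name__="fake.cuda", _sleep=lambda cycles: cycles)
fake_xpu = types.SimpleNamespace(__name__="fake.xpu")  # no `_sleep`, like torch.xpu here
```

A host-side time.sleep is deliberately not used as a fallback, since it would not delay work queued on the device the way the test appears to expect; skipping keeps the failure mode explicit.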

version:

env_weekly.txt

Alternatives

No response

Additional context

No response
