Closed
Description
PyTorch reproducer:
python -c "import torch;torch.nn.ConvTranspose1d(2016, 1026, 1024, stride=256)(torch.rand(1, 2016, 224))"
Standalone oneDNN (v3.7.1) with ACL (v25.02) reproducer (with benchdnn):
ONEDNN_VERBOSE=profile,profile_externals ./tests/benchdnn/benchdnn --deconv mb1_ic2016oc1026_ih1oh1kh1sh1dh0ph0_iw224ow58112kw1024sw256dw0pw0
Root cause:
The root cause in ACL is a write to an invalid address on this line
The address we write to is calculated by the ptr_to_element function, which computes the offset via offset_element_in_bytes. That offset is an int32_t, which overflows in this case because of the very large number of parameters in the problem: 2016 * 1026 * 1024 (about 2.1 billion elements). The result is a negative offset (-2139758464).
A smaller workload like torch.nn.ConvTranspose1d(500, 1026, 1024, stride=256)(torch.rand(1, 500, 224)) doesn't crash.
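The difference between the two workloads can be checked with some quick arithmetic. The sketch below is only illustrative: it assumes 4-byte float32 elements (PyTorch's default dtype) and that the overflowing offset is a byte offset into a tensor with one element per parameter. The helpers wrap_int32 and weight_bytes are hypothetical names, not ACL functions.

```python
INT32_MAX = 2**31 - 1


def wrap_int32(value: int) -> int:
    """Emulate storing an integer in a two's-complement int32_t."""
    value &= 0xFFFFFFFF
    return value - 2**32 if value >= 2**31 else value


def weight_bytes(in_channels, out_channels, kernel_size, element_size=4):
    """Size in bytes of a dense tensor with one element per parameter.

    element_size=4 assumes float32, which is an assumption of this sketch.
    """
    return in_channels * out_channels * kernel_size * element_size


failing = weight_bytes(2016, 1026, 1024)  # crashing workload
passing = weight_bytes(500, 1026, 1024)   # smaller workload that works

print(failing, failing > INT32_MAX)  # exceeds the int32_t range
print(passing, passing > INT32_MAX)  # still fits in int32_t

# The reported negative offset is what a true offset of 2155208832 bytes
# (or that value plus a multiple of 2**32) looks like after wrapping.
print(wrap_int32(2155208832))
```

Under these assumptions the crashing workload needs byte offsets of up to roughly 8.5 GB, far beyond what int32_t can hold, while the smaller workload stays just under the 2^31 - 1 limit, which is consistent with it not crashing.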
Suggested fixes:
- use 64 bits for the offset (and the return type) in offset_element_in_bytes, OR
- make ACL fail the validation stage if the problem is too big.
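Both suggestions can be modelled in a few lines. This is a rough sketch of the idea, not ACL code: offset_element_in_bytes_64, dense_strides and offsets_fit_int32 are hypothetical names, the dense layout with the innermost dimension first is an assumption, and Python's arbitrary-precision integers stand in for a 64-bit accumulator.

```python
INT32_MAX = 2**31 - 1


def dense_strides(shape, element_size=4):
    """Byte strides for a dense tensor, innermost dimension first (assumed layout)."""
    strides, step = [], element_size
    for dim in shape:
        strides.append(step)
        step *= dim
    return strides


def offset_element_in_bytes_64(coords, strides_in_bytes):
    """Hypothetical wide-offset variant: the sum is never truncated to 32 bits."""
    return sum(c * s for c, s in zip(coords, strides_in_bytes))


def offsets_fit_int32(shape, strides_in_bytes):
    """Validation-stage check: does the largest element offset fit in int32_t?"""
    last_element = [dim - 1 for dim in shape]
    return offset_element_in_bytes_64(last_element, strides_in_bytes) <= INT32_MAX


big = (1024, 1026, 2016)   # kernel, output channels, input channels (assumed order)
small = (1024, 1026, 500)

print(offsets_fit_int32(big, dense_strides(big)))      # would be rejected
print(offsets_fit_int32(small, dense_strides(small)))  # would be accepted
```

The first option removes the limit altogether, while the second keeps the 32-bit arithmetic and turns the crash into a clean validation failure, presumably letting the caller fall back to another implementation.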
Full Stack Trace: See pytorch/pytorch#165654