🐛 Describe the bug
Training eca_halonext26ts with bs=128 shows a performance regression from 277 to 373, caused by pytorch/pytorch@4827c16. That commit uses max_autotune_pointwise instead of autotune_pointwise, which removes some Triton config candidates for pointwise kernels.
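To illustrate why a smaller candidate set can hurt, here is a toy sketch in plain Python. It is not Inductor code: `pick_best`, the config names, and the timings are all hypothetical placeholders, not real Triton configs or measurements from this issue. It assumes only that autotuning benchmarks each candidate and keeps the fastest one.

```python
# Toy illustration only: config names and timings are invented placeholders,
# not real Triton configs or measurements from this issue.
from typing import Dict, Iterable


def pick_best(candidates: Iterable[str], timings_ms: Dict[str, float]) -> str:
    """Return the candidate with the lowest benchmarked time."""
    return min(candidates, key=lambda name: timings_ms[name])


# Hypothetical per-config timings for a single pointwise kernel.
timings_ms = {"cfg_a": 1.0, "cfg_b": 0.7, "cfg_c": 1.3}

full_set = ["cfg_a", "cfg_b", "cfg_c"]
reduced_set = ["cfg_a", "cfg_c"]  # cfg_b is no longer offered as a candidate

best_full = pick_best(full_set, timings_ms)
best_reduced = pick_best(reduced_set, timings_ms)

print(f"full set picks {best_full}, reduced set picks {best_reduced}")
```

If the fastest config for a kernel is among the removed candidates, the autotuner can only fall back to a slower one, which would be consistent with the regression described above.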
Versions
b580