
[OOB Performance] The performance impact caused by TORCHINDUCTOR_ONLINE_SOFTMAX #2650

@RUIJIEZHONG66166

🐛 Describe the bug

With `export TORCHINDUCTOR_ONLINE_SOFTMAX=0` set, several models run noticeably faster on BMG Linux. All latencies in the table below are in ms; a reproduction sketch follows the table.

| Scenario | Model | Batch Size | 2.11.0.dev20251222+xpu (ms) | 2.11.0.dev20251222+xpu + TORCHINDUCTOR_ONLINE_SOFTMAX=0 (ms) |
|---|---|---|---|---|
| Inference | hf_T5 | 16 | 467.9 | 409.29 |
| Inference | hf_T5_base | 1 | 82.45 | 73.765 |
| Inference | hf_T5_generate | 16 | 144.32 | 129.137 |
| Inference | OPTForCausalLM | 16 | 387.27 | 336.864 |
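
A minimal sketch of how the two configurations can be compared, assuming a plain HuggingFace `T5Model` as a stand-in for the TorchBench `hf_T5` workload and that `TORCHINDUCTOR_ONLINE_SOFTMAX` follows the usual `TORCHINDUCTOR_*` convention of overriding the corresponding Inductor config at import time (the exact benchmark harness used for the numbers above is not shown in this issue):

```python
import os
import time

# Set before importing torch so Inductor picks up the override when its
# config module is initialized. Use "0" to disable online softmax; the
# default build behaves as if it were "1".
os.environ["TORCHINDUCTOR_ONLINE_SOFTMAX"] = "0"

import torch
from transformers import T5Config, T5Model

device = "xpu"  # assumes an XPU (BMG) build of PyTorch is available
model = T5Model(T5Config()).to(device).eval()
model = torch.compile(model)

batch = 16
ids = torch.randint(0, 32000, (batch, 128), device=device)

with torch.no_grad():
    # Warm-up iterations trigger compilation; only steady state is timed.
    for _ in range(3):
        model(input_ids=ids, decoder_input_ids=ids)
    torch.xpu.synchronize()
    start = time.perf_counter()
    for _ in range(10):
        model(input_ids=ids, decoder_input_ids=ids)
    torch.xpu.synchronize()
    print(f"mean latency: {(time.perf_counter() - start) / 10 * 1e3:.2f} ms")
```

Running the same script with the environment variable unset (or set to `1`) gives the baseline column.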

Versions

PyTorch: 2.11.0.dev20251222+xpu
HW: BMG Linux
