[OOB Performance] The performance impact caused by TORCHINDUCTOR_ONLINE_SOFTMAX #2650

### 🐛 Describe the bug
Setting `export TORCHINDUCTOR_ONLINE_SOFTMAX=0` improves inference performance for some models on BMG Linux: latency drops by roughly 10% to 13% for the models listed below. All latencies are in ms, lower is better.

Scenario | Model | Batch Size | 2.11.0.dev20251222+xpu (ms) | 2.11.0.dev20251222+xpu + TORCHINDUCTOR_ONLINE_SOFTMAX=0 (ms)
-- | -- | -- | -- | --
Inference | hf_T5 | 16 | 467.9 | 409.29
Inference | hf_T5_base | 1 | 82.45 | 73.765
Inference | hf_T5_generate | 16 | 144.32 | 129.137
Inference | OPTForCausalLM | 16 | 387.27 | 336.864
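
To make the size of the gap explicit, the relative latency reduction can be computed from the table. This is a small sketch that uses only the numbers reported above; the helper name `latency_reduction_pct` is made up for illustration.

```python
# Latencies in ms, copied from the table above:
# (default run, run with TORCHINDUCTOR_ONLINE_SOFTMAX=0)
latencies_ms = {
    "hf_T5": (467.9, 409.29),
    "hf_T5_base": (82.45, 73.765),
    "hf_T5_generate": (144.32, 129.137),
    "OPTForCausalLM": (387.27, 336.864),
}


def latency_reduction_pct(default_ms, disabled_ms):
    """Percentage by which latency drops when online softmax is disabled."""
    return (default_ms - disabled_ms) / default_ms * 100.0


reductions = {
    model: latency_reduction_pct(default_ms, disabled_ms)
    for model, (default_ms, disabled_ms) in latencies_ms.items()
}

for model, pct in reductions.items():
    print(f"{model}: {pct:.1f}% lower latency with TORCHINDUCTOR_ONLINE_SOFTMAX=0")
```

All four models land in the same range, which suggests the regression is not specific to one model or batch size.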


### Versions

PyTorch: 2.11.0.dev20251222+xpu
HW: BMG, Linux
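
To compare both settings on the same build, each run can be launched in its own process with the variable set before PyTorch is imported. The sketch below assumes that Inductor reads `TORCHINDUCTOR_ONLINE_SOFTMAX` from the environment at import time and that the default run corresponds to the value `1`. The benchmark command is not given in this issue, so `placeholder_cmd` is a stand-in that only echoes the variable; `run_with_flag` is a hypothetical helper.

```python
import os
import subprocess
import sys

ENV_VAR = "TORCHINDUCTOR_ONLINE_SOFTMAX"


def run_with_flag(cmd, value):
    """Run cmd in a child process with ENV_VAR set to value and return its stdout."""
    env = dict(os.environ)  # copy, so the parent environment stays untouched
    env[ENV_VAR] = value
    result = subprocess.run(cmd, env=env, capture_output=True, text=True, check=True)
    return result.stdout.strip()


# Placeholder command: it only prints the variable so the sketch runs anywhere.
# Swap in the real benchmark invocation, which this issue does not specify.
placeholder_cmd = [sys.executable, "-c", f"import os; print(os.environ['{ENV_VAR}'])"]

# Assumption: "1" matches the default (online softmax enabled), "0" disables it.
outputs = {value: run_with_flag(placeholder_cmd, value) for value in ("1", "0")}
print(outputs)
```

Using a fresh process per setting avoids the case where the variable is changed after the configuration has already been read.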
