[BMG][OOB] t5 inference performance drop 2 #2654

@jianyizh

🐛 Describe the bug

We changed the Triton heuristic, which caused a performance drop in T5 inference. We should ask the Triton team to fix the performance (by not using SLM, i.e. shared local memory).

| model | batch size | perf before | perf after |
| --- | --- | --- | --- |
| T5ForConditionalGeneration | 32 | 292 | 435 |
| T5Small | 32 | 292 | 435 |
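
To put a number on the regression, here is a minimal sketch that computes the relative change from the figures in the table above. The issue does not state the unit; treating the values as a latency-style metric (higher is worse) is an assumption, made only because the larger "after" number is described as a drop. The names `results` and `relative_change` are illustrative and not part of any benchmark harness.

```python
# Figures copied from the table above. The unit is not stated in the issue;
# a latency-style metric (higher is worse) is an assumption.
results = {
    "T5ForConditionalGeneration": (292, 435),
    "T5Small": (292, 435),
}


def relative_change(before: float, after: float) -> float:
    """Return the change from `before` to `after` as a fraction of `before`."""
    return (after - before) / before


# Relative change per model, keyed by model name.
changes = {model: relative_change(b, a) for model, (b, a) in results.items()}

for model, change in changes.items():
    print(f"{model}: {change:+.1%}")
```

Both rows come out to roughly a 49% increase in the reported number.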

Versions

b580
