
graph: enable quantized gated mlp dispatch#4838

Open
TaoLv wants to merge 7 commits into main from lvtao/main/quantized-gated-mlp


TaoLv commented Mar 17, 2026

  1. Update the benchdnn example and test files to use an f32 intermediate data type. (Needs Arch review.)
  2. Extend the graph backend gated MLP kernel to support quantized inputs.
  3. Dispatch quantized gated MLP patterns to the quantized gated MLP primitive.

Dispatching is still disabled by default. To enable it, set _ONEDNN_GRAPH_GATED_MLP_FORCE_PRIMITIVE=0.
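
For readers unfamiliar with the pattern, the following is a rough reference sketch of what a quantized gated MLP computes, with the intermediate results kept in floating point. It is not taken from this PR or from the oneDNN API: the function names, the swish activation, and the per-tensor asymmetric quantization of the three weight tensors are all assumptions made for illustration.

```python
import math


def dequantize(q, scale, zero_point):
    """Map an integer matrix to floats: (q - zero_point) * scale (assumed scheme)."""
    return [[(v - zero_point) * scale for v in row] for row in q]


def matmul(a, b):
    """Plain row-by-column matrix product on nested lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def swish(v):
    """Swish/SiLU activation; the activation choice is an assumption."""
    return v / (1.0 + math.exp(-v))


def quantized_gated_mlp(x, qw_gate, qw_up, qw_down, scale, zero_point):
    """Hypothetical reference: dequantize weights, then run the gated MLP in float."""
    w_gate = dequantize(qw_gate, scale, zero_point)
    w_up = dequantize(qw_up, scale, zero_point)
    w_down = dequantize(qw_down, scale, zero_point)

    gate = matmul(x, w_gate)  # gate branch
    up = matmul(x, w_up)      # up-projection branch
    # Element-wise gating; these intermediates stay in floating point.
    hidden = [[swish(g) * u for g, u in zip(g_row, u_row)]
              for g_row, u_row in zip(gate, up)]
    return matmul(hidden, w_down)  # down projection
```

A fused primitive would compute the same result in one kernel instead of separate matmul and element-wise steps, which is what dispatching the pattern to a dedicated primitive is meant to achieve.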

@TaoLv TaoLv requested review from a team as code owners March 17, 2026 03:49
@github-actions github-actions bot added component:graph-api Codeowner: @oneapi-src/onednn-graph component:tests Codeowner: @oneapi-src/onednn-arch component:examples labels Mar 17, 2026

TaoLv commented Mar 17, 2026

make test
disable benchdnn_all
enable benchdnn_graph

@TaoLv TaoLv force-pushed the lvtao/main/quantized-gated-mlp branch from e437944 to bd25ee1 Compare March 18, 2026 03:43

TaoLv commented Mar 18, 2026

make test
disable benchdnn_all
enable benchdnn_graph

