
Commit 1e9425f

jianyuh authored and facebook-github-bot committed
Add Llama4 shapes in quantize_bench (#4129)
Summary: X-link: facebookresearch/FBGEMM#1210
Pull Request resolved: #4129
Reviewed By: jiawenliu64
Differential Revision: D74788497
fbshipit-source-id: 3fde36ea0e3dc78b65e3f97d44d05be58e5b8938
1 parent 127848a commit 1e9425f

File tree: 1 file changed (+9, -2 lines)


fbgemm_gpu/experimental/gen_ai/bench/quantize_bench.py

Lines changed: 9 additions & 2 deletions
@@ -72,20 +72,27 @@ def get_llama_shapes() -> List[Tuple[int, int, int, int]]:

     llama_shapes = []
     for M in [1, 16, 32, 64, 96, 128, 16384]:
-        # Add shapes for llama 70B
+        # Add shapes for llama3 70B
         llama_shapes += [
             (1, M, 1280, 8192),
             (1, M, 8192, 1024),
             (1, M, 7168, 8192),
             (1, M, 8192, 3584),
         ]
-        # Add shapes for llama 405B
+        # Add shapes for llama3 405B
         llama_shapes += [
             (1, M, 13312, 6656),
             (1, M, 13312, 16384),
             (1, M, 16384, 6656),
             (1, M, 16384, 16384),
         ]
+        # Add shapes for llama4 Scout/Maverick (17Bx{16,128})
+        llama_shapes += [
+            (1, M, 896, 5120),
+            (1, M, 5120, 640),
+            (1, M, 2048, 5120),
+            (1, M, 5120, 1024),
+        ]

     return llama_shapes
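For context, the shape lists returned by get_llama_shapes() are four-element tuples, read here as (B, M, N, K); that ordering is an assumption, since this diff does not spell it out. The sketch below is a minimal, hypothetical consumer of such a shape list: it times a plain bf16 batched matmul rather than the quantized kernels quantize_bench actually dispatches, and the benchmark_gemm helper is not part of the real quantize_bench API.

# Minimal sketch (hypothetical, not the quantize_bench harness): consume
# (B, M, N, K) shape tuples like those from get_llama_shapes() and time a
# plain bf16 batched GEMM for each one.
import time
from typing import List, Tuple

import torch


def benchmark_gemm(shapes: List[Tuple[int, int, int, int]], iters: int = 10) -> None:
    for B, M, N, K in shapes:
        # a is (B, M, K), b is (B, K, N); torch.bmm produces (B, M, N).
        a = torch.randn(B, M, K, device="cuda", dtype=torch.bfloat16)
        b = torch.randn(B, K, N, device="cuda", dtype=torch.bfloat16)
        torch.cuda.synchronize()
        start = time.perf_counter()
        for _ in range(iters):
            torch.bmm(a, b)
        torch.cuda.synchronize()
        elapsed = (time.perf_counter() - start) / iters
        # 2 * B * M * N * K multiply-adds per batched GEMM.
        tflops = 2 * B * M * N * K / elapsed / 1e12
        print(f"B={B} M={M} N={N} K={K}: {elapsed * 1e3:.3f} ms, {tflops:.1f} TFLOPS")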
