Commit 32c4ddc

wuxun-zhang authored and xinyu-intel committed
make sure num tokens divisible by tp_size
Signed-off-by: Wuxun Zhang <[email protected]>
1 parent 2842c59 commit 32c4ddc

1 file changed: +5 -0 lines changed

vllm_gaudi/v1/worker/hpu_dp_utils.py

Lines changed: 5 additions & 0 deletions
@@ -22,6 +22,11 @@ def make(
         dp_size = vllm_config.parallel_config.data_parallel_size
         tp_size = vllm_config.parallel_config.tensor_parallel_size
 
+        if num_tokens % tp_size != 0:
+            # make sure num_tokens is enough to be divided by tp_size for
+            # sequence parallel MOE
+            num_tokens = (num_tokens // tp_size + 1) * tp_size
+
         num_tokens_across_dp = num_tokens * dp_size
 
         dtype = vllm_config.model_config.dtype
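
For readers checking the arithmetic, the rounding added in this hunk can be exercised in isolation. The following is a minimal standalone sketch, not code from the repository: the helper names pad_to_multiple and pad_to_multiple_ceil are hypothetical, and the input validation is an added assumption. It suggests that the guarded expression from the diff rounds num_tokens up to the next multiple of tp_size and leaves already divisible values unchanged, and that a ceiling-division form gives the same result.

```python
def pad_to_multiple(num_tokens: int, tp_size: int) -> int:
    """Round num_tokens up to the nearest multiple of tp_size.

    Hypothetical helper; it mirrors the arithmetic of the added lines.
    """
    if tp_size <= 0:
        # Assumption: the commit itself does not validate tp_size.
        raise ValueError("tp_size must be positive")
    if num_tokens % tp_size != 0:
        # Same expression as in the diff: bump to the next multiple.
        num_tokens = (num_tokens // tp_size + 1) * tp_size
    return num_tokens


def pad_to_multiple_ceil(num_tokens: int, tp_size: int) -> int:
    """Branch-free equivalent using ceiling division (hypothetical helper)."""
    return -(-num_tokens // tp_size) * tp_size


if __name__ == "__main__":
    for n, tp in [(7, 4), (8, 4), (1, 8), (0, 2)]:
        padded = pad_to_multiple(n, tp)
        # Both formulations should agree on every input.
        assert padded == pad_to_multiple_ceil(n, tp)
        print(f"num_tokens={n}, tp_size={tp} -> {padded}")
```

The guard on the remainder matters: without it, the expression from the diff would add a full extra tp_size tokens to counts that are already divisible.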
