Error loading Qwen3-32B Speculator #149

@jiwonsong-dev

Description

When I try to launch Qwen3-32B with the Eagle-3 speculator, the engine fails to launch with a RuntimeError like the one below.
I am using vLLM 0.11.0 and speculators 0.1.0.

The command is:
vllm serve RedHatAI/Qwen3-32B-speculator.eagle3 \
  --dtype auto \
  -tp 2 \
  --max_model_len 36000 \
  --gpu-memory-utilization 0.7 \
  --enable-prefix-caching \
  --dtype bfloat16 \
  --port 30000

The same command works without any problem for RedHatAI/Qwen3-8B-speculator.eagle3.
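One possible reading of the size mismatch reported in the error message below (5120 rows found, 3200 expected) is a head-dimension discrepancy in the fused QKV projection. The sketch below is only arithmetic, not a confirmed diagnosis. The config values (hidden size, head counts, head_dim) are assumptions about Qwen3-32B and Qwen3-8B that do not appear in this report, and `qkv_rows_per_rank` is a hypothetical helper, not a vLLM function.

```python
def qkv_rows_per_rank(num_heads: int, num_kv_heads: int, head_dim: int, tp_size: int) -> int:
    """Rows of a fused QKV weight held by one tensor-parallel rank (hypothetical helper)."""
    total_rows = (num_heads + 2 * num_kv_heads) * head_dim
    return total_rows // tp_size


TP_SIZE = 2  # from `-tp 2` in the command above

# ASSUMED Qwen3-32B config values (not stated in this report).
hidden_32b, heads_32b, kv_heads_32b, head_dim_32b = 5120, 64, 8, 128

# Rows when the explicit head_dim is used.
rows_explicit = qkv_rows_per_rank(heads_32b, kv_heads_32b, head_dim_32b, TP_SIZE)
# Rows when head_dim is derived as hidden_size // num_heads instead.
rows_derived = qkv_rows_per_rank(heads_32b, kv_heads_32b, hidden_32b // heads_32b, TP_SIZE)

# ASSUMED Qwen3-8B config values: here both ways of getting head_dim agree.
hidden_8b, heads_8b, head_dim_8b = 4096, 32, 128
same_head_dim_8b = (hidden_8b // heads_8b) == head_dim_8b

print(rows_explicit, rows_derived, same_head_dim_8b)
```

Under these assumptions the two row counts match the two numbers in the assertion, and the 8B model would not hit the discrepancy. That would be consistent with the 8B speculator working, but the log alone does not establish where the 3200 comes from.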

The error message is:
(Worker_TP1 pid=134175) ERROR 10-13 21:41:36 [multiproc_executor.py:671] WorkerProc hit an exception. (Worker_TP1 pid=134175) ERROR 10-13 21:41:36 [multiproc_executor.py:671] Traceback (most recent call last): (Worker_TP1 pid=134175) ERROR 10-13 21:41:36 [multiproc_executor.py:671] File "/home/jiwonsong/.conda/envs/moe/lib/python3.12/site-packages/vllm/v1/executor/multiproc_executor.py", line 666, in worker_busy_loop (Worker_TP1 pid=134175) ERROR 10-13 21:41:36 [multiproc_executor.py:671] output = func(*args, **kwargs) (Worker_TP1 pid=134175) ERROR 10-13 21:41:36 [multiproc_executor.py:671] ^^^^^^^^^^^^^^^^^^^^^ (Worker_TP1 pid=134175) ERROR 10-13 21:41:36 [multiproc_executor.py:671] File "/home/jiwonsong/.conda/envs/moe/lib/python3.12/site-packages/torch/utils/_contextlib.py", line 120, in decorate_context (Worker_TP1 pid=134175) ERROR 10-13 21:41:36 [multiproc_executor.py:671] return func(*args, **kwargs) (Worker_TP1 pid=134175) ERROR 10-13 21:41:36 [multiproc_executor.py:671] ^^^^^^^^^^^^^^^^^^^^^ (Worker_TP1 pid=134175) ERROR 10-13 21:41:36 [multiproc_executor.py:671] File "/home/jiwonsong/.conda/envs/moe/lib/python3.12/site-packages/vllm/v1/worker/gpu_worker.py", line 263, in determine_available_memory (Worker_TP1 pid=134175) ERROR 10-13 21:41:36 [multiproc_executor.py:671] self.model_runner.profile_run() (Worker_TP1 pid=134175) ERROR 10-13 21:41:36 [multiproc_executor.py:671] File "/home/jiwonsong/.conda/envs/moe/lib/python3.12/site-packages/vllm/v1/worker/gpu_model_runner.py", line 3392, in profile_run (Worker_TP1 pid=134175) ERROR 10-13 21:41:36 [multiproc_executor.py:671] = self._dummy_run(self.max_num_tokens, is_profile=True) (Worker_TP1 pid=134175) ERROR 10-13 21:41:36 [multiproc_executor.py:671] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ (Worker_TP1 pid=134175) ERROR 10-13 21:41:36 [multiproc_executor.py:671] File "/home/jiwonsong/.conda/envs/moe/lib/python3.12/site-packages/torch/utils/_contextlib.py", line 120, in decorate_context 
(Worker_TP1 pid=134175) ERROR 10-13 21:41:36 [multiproc_executor.py:671] return func(*args, **kwargs) (Worker_TP1 pid=134175) ERROR 10-13 21:41:36 [multiproc_executor.py:671] ^^^^^^^^^^^^^^^^^^^^^ (Worker_TP1 pid=134175) ERROR 10-13 21:41:36 [multiproc_executor.py:671] File "/home/jiwonsong/.conda/envs/moe/lib/python3.12/site-packages/vllm/v1/worker/gpu_model_runner.py", line 3167, in _dummy_run (Worker_TP1 pid=134175) ERROR 10-13 21:41:36 [multiproc_executor.py:671] self.drafter.dummy_run(num_tokens) (Worker_TP1 pid=134175) ERROR 10-13 21:41:36 [multiproc_executor.py:671] File "/home/jiwonsong/.conda/envs/moe/lib/python3.12/site-packages/torch/utils/_contextlib.py", line 120, in decorate_context (Worker_TP1 pid=134175) ERROR 10-13 21:41:36 [multiproc_executor.py:671] return func(*args, **kwargs) (Worker_TP1 pid=134175) ERROR 10-13 21:41:36 [multiproc_executor.py:671] ^^^^^^^^^^^^^^^^^^^^^ (Worker_TP1 pid=134175) ERROR 10-13 21:41:36 [multiproc_executor.py:671] File "/home/jiwonsong/.conda/envs/moe/lib/python3.12/site-packages/vllm/v1/spec_decode/eagle.py", line 919, in dummy_run (Worker_TP1 pid=134175) ERROR 10-13 21:41:36 [multiproc_executor.py:671] self.model( (Worker_TP1 pid=134175) ERROR 10-13 21:41:36 [multiproc_executor.py:671] File "/home/jiwonsong/.conda/envs/moe/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1773, in _wrapped_call_impl (Worker_TP1 pid=134175) ERROR 10-13 21:41:36 [multiproc_executor.py:671] return self._call_impl(*args, **kwargs) (Worker_TP1 pid=134175) ERROR 10-13 21:41:36 [multiproc_executor.py:671] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ (Worker_TP1 pid=134175) ERROR 10-13 21:41:36 [multiproc_executor.py:671] File "/home/jiwonsong/.conda/envs/moe/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1784, in _call_impl (Worker_TP1 pid=134175) ERROR 10-13 21:41:36 [multiproc_executor.py:671] return forward_call(*args, **kwargs) (Worker_TP1 pid=134175) ERROR 10-13 21:41:36 [multiproc_executor.py:671] 
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ (Worker_TP1 pid=134175) ERROR 10-13 21:41:36 [multiproc_executor.py:671] File "/home/jiwonsong/.conda/envs/moe/lib/python3.12/site-packages/vllm/model_executor/models/llama_eagle3.py", line 241, in forward (Worker_TP1 pid=134175) ERROR 10-13 21:41:36 [multiproc_executor.py:671] return self.model(input_ids, positions, hidden_states) (Worker_TP1 pid=134175) ERROR 10-13 21:41:36 [multiproc_executor.py:671] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ (Worker_TP1 pid=134175) ERROR 10-13 21:41:36 [multiproc_executor.py:671] File "/home/jiwonsong/.conda/envs/moe/lib/python3.12/site-packages/vllm/compilation/decorators.py", line 310, in __call__ (Worker_TP1 pid=134175) ERROR 10-13 21:41:36 [multiproc_executor.py:671] output = self.compiled_callable(*args, **kwargs) (Worker_TP1 pid=134175) ERROR 10-13 21:41:36 [multiproc_executor.py:671] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ (Worker_TP1 pid=134175) ERROR 10-13 21:41:36 [multiproc_executor.py:671] File "/home/jiwonsong/.conda/envs/moe/lib/python3.12/site-packages/torch/_dynamo/eval_frame.py", line 736, in compile_wrapper (Worker_TP1 pid=134175) ERROR 10-13 21:41:36 [multiproc_executor.py:671] return fn(*args, **kwargs) (Worker_TP1 pid=134175) ERROR 10-13 21:41:36 [multiproc_executor.py:671] ^^^^^^^^^^^^^^^^^^^ (Worker_TP1 pid=134175) ERROR 10-13 21:41:36 [multiproc_executor.py:671] File "/home/jiwonsong/.conda/envs/moe/lib/python3.12/site-packages/vllm/model_executor/models/llama_eagle3.py", line 143, in forward (Worker_TP1 pid=134175) ERROR 10-13 21:41:36 [multiproc_executor.py:671] def forward( (Worker_TP1 pid=134175) ERROR 10-13 21:41:36 [multiproc_executor.py:671] File "/home/jiwonsong/.conda/envs/moe/lib/python3.12/site-packages/torch/_dynamo/eval_frame.py", line 375, in __call__ (Worker_TP1 pid=134175) ERROR 10-13 21:41:36 [multiproc_executor.py:671] return super().__call__(*args, **kwargs) (Worker_TP1 pid=134175) ERROR 10-13 21:41:36 [multiproc_executor.py:671] 
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ (Worker_TP1 pid=134175) ERROR 10-13 21:41:36 [multiproc_executor.py:671] File "/home/jiwonsong/.conda/envs/moe/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1773, in _wrapped_call_impl (Worker_TP1 pid=134175) ERROR 10-13 21:41:36 [multiproc_executor.py:671] return self._call_impl(*args, **kwargs) (Worker_TP1 pid=134175) ERROR 10-13 21:41:36 [multiproc_executor.py:671] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ (Worker_TP1 pid=134175) ERROR 10-13 21:41:36 [multiproc_executor.py:671] File "/home/jiwonsong/.conda/envs/moe/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1784, in _call_impl (Worker_TP1 pid=134175) ERROR 10-13 21:41:36 [multiproc_executor.py:671] return forward_call(*args, **kwargs) (Worker_TP1 pid=134175) ERROR 10-13 21:41:36 [multiproc_executor.py:671] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ (Worker_TP1 pid=134175) ERROR 10-13 21:41:36 [multiproc_executor.py:671] File "/home/jiwonsong/.conda/envs/moe/lib/python3.12/site-packages/torch/_dynamo/eval_frame.py", line 929, in _fn (Worker_TP1 pid=134175) ERROR 10-13 21:41:36 [multiproc_executor.py:671] return fn(*args, **kwargs) (Worker_TP1 pid=134175) ERROR 10-13 21:41:36 [multiproc_executor.py:671] ^^^^^^^^^^^^^^^^^^^ (Worker_TP1 pid=134175) ERROR 10-13 21:41:36 [multiproc_executor.py:671] File "/home/jiwonsong/.conda/envs/moe/lib/python3.12/site-packages/torch/fx/graph_module.py", line 848, in call_wrapped (Worker_TP1 pid=134175) ERROR 10-13 21:41:36 [multiproc_executor.py:671] return self._wrapped_call(self, *args, **kwargs) (Worker_TP1 pid=134175) ERROR 10-13 21:41:36 [multiproc_executor.py:671] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ (Worker_TP1 pid=134175) ERROR 10-13 21:41:36 [multiproc_executor.py:671] File "/home/jiwonsong/.conda/envs/moe/lib/python3.12/site-packages/torch/fx/graph_module.py", line 424, in __call__ (Worker_TP1 pid=134175) ERROR 10-13 21:41:36 [multiproc_executor.py:671] raise e (Worker_TP1 pid=134175) ERROR 10-13 21:41:36 
[multiproc_executor.py:671] File "/home/jiwonsong/.conda/envs/moe/lib/python3.12/site-packages/torch/fx/graph_module.py", line 411, in __call__ (Worker_TP1 pid=134175) ERROR 10-13 21:41:36 [multiproc_executor.py:671] return super(self.cls, obj).__call__(*args, **kwargs) # type: ignore[misc] (Worker_TP1 pid=134175) ERROR 10-13 21:41:36 [multiproc_executor.py:671] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ (Worker_TP1 pid=134175) ERROR 10-13 21:41:36 [multiproc_executor.py:671] File "/home/jiwonsong/.conda/envs/moe/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1773, in _wrapped_call_impl (Worker_TP1 pid=134175) ERROR 10-13 21:41:36 [multiproc_executor.py:671] return self._call_impl(*args, **kwargs) (Worker_TP1 pid=134175) ERROR 10-13 21:41:36 [multiproc_executor.py:671] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ (Worker_TP1 pid=134175) ERROR 10-13 21:41:36 [multiproc_executor.py:671] File "/home/jiwonsong/.conda/envs/moe/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1784, in _call_impl (Worker_TP1 pid=134175) ERROR 10-13 21:41:36 [multiproc_executor.py:671] return forward_call(*args, **kwargs) (Worker_TP1 pid=134175) ERROR 10-13 21:41:36 [multiproc_executor.py:671] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ (Worker_TP1 pid=134175) ERROR 10-13 21:41:36 [multiproc_executor.py:671] File "<eval_with_key>.135", line 18, in forward (Worker_TP1 pid=134175) ERROR 10-13 21:41:36 [multiproc_executor.py:671] submod_0 = self.submod_0(l_input_ids_, s72, l_self_modules_embed_tokens_parameters_weight_, l_self_modules_layers_modules_0_modules_input_layernorm_parameters_weight_, l_hidden_states_, l_self_modules_layers_modules_0_modules_hidden_norm_parameters_weight_, l_self_modules_layers_modules_0_modules_self_attn_modules_qkv_proj_parameters_weight_, l_positions_, l_self_modules_layers_modules_0_modules_self_attn_modules_rotary_emb_buffers_cos_sin_cache_, s47); l_input_ids_ = l_self_modules_embed_tokens_parameters_weight_ = 
l_self_modules_layers_modules_0_modules_input_layernorm_parameters_weight_ = l_hidden_states_ = l_self_modules_layers_modules_0_modules_hidden_norm_parameters_weight_ = l_self_modules_layers_modules_0_modules_self_attn_modules_qkv_proj_parameters_weight_ = l_positions_ = l_self_modules_layers_modules_0_modules_self_attn_modules_rotary_emb_buffers_cos_sin_cache_ = s47 = None (Worker_TP1 pid=134175) ERROR 10-13 21:41:36 [multiproc_executor.py:671] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ (Worker_TP1 pid=134175) ERROR 10-13 21:41:36 [multiproc_executor.py:671] File "/home/jiwonsong/.conda/envs/moe/lib/python3.12/site-packages/vllm/compilation/cuda_graph.py", line 121, in __call__ (Worker_TP1 pid=134175) ERROR 10-13 21:41:36 [multiproc_executor.py:671] return self.runnable(*args, **kwargs) (Worker_TP1 pid=134175) ERROR 10-13 21:41:36 [multiproc_executor.py:671] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ (Worker_TP1 pid=134175) ERROR 10-13 21:41:36 [multiproc_executor.py:671] File "/home/jiwonsong/.conda/envs/moe/lib/python3.12/site-packages/vllm/compilation/cuda_piecewise_backend.py", line 90, in __call__ (Worker_TP1 pid=134175) ERROR 10-13 21:41:36 [multiproc_executor.py:671] return self.compiled_graph_for_general_shape(*args) (Worker_TP1 pid=134175) ERROR 10-13 21:41:36 [multiproc_executor.py:671] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ (Worker_TP1 pid=134175) ERROR 10-13 21:41:36 [multiproc_executor.py:671] File "/home/jiwonsong/.conda/envs/moe/lib/python3.12/site-packages/vllm/compilation/compiler_interface.py", line 518, in compiled_graph (Worker_TP1 pid=134175) ERROR 10-13 
21:41:36 [multiproc_executor.py:671] graph_output = inductor_compiled_graph(list_args) (Worker_TP1 pid=134175) ERROR 10-13 21:41:36 [multiproc_executor.py:671] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ (Worker_TP1 pid=134175) ERROR 10-13 21:41:36 [multiproc_executor.py:671] File "/home/jiwonsong/.conda/envs/moe/lib/python3.12/site-packages/torch/_inductor/output_code.py", line 584, in __call__ (Worker_TP1 pid=134175) ERROR 10-13 21:41:36 [multiproc_executor.py:671] return self.current_callable(inputs) (Worker_TP1 pid=134175) ERROR 10-13 21:41:36 [multiproc_executor.py:671] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ (Worker_TP1 pid=134175) ERROR 10-13 21:41:36 [multiproc_executor.py:671] File "/home/jiwonsong/.cache/vllm/torch_compile_cache/066401215d/rank_1_0/inductor_cache/vk/cvkszagsr672mug5ywgzm3f3i35o4a3tdeej5zn5hv5b2mqzwjfq.py", line 403, in call (Worker_TP1 pid=134175) ERROR 10-13 21:41:36 [multiproc_executor.py:671] assert_size_stride(arg6_1, (3200, 10240), (10240, 1)) (Worker_TP1 pid=134175) ERROR 10-13 21:41:36 [multiproc_executor.py:671] AssertionError: expected size 5120==3200, stride 10240==10240 at dim=0 (Worker_TP1 pid=134175) ERROR 10-13 21:41:36 [multiproc_executor.py:671] This error most often comes from a incorrect fake (aka meta) kernel for a custom op. (Worker_TP1 pid=134175) ERROR 10-13 21:41:36 [multiproc_executor.py:671] Use torch.library.opcheck to test your custom op. 
(Worker_TP1 pid=134175) ERROR 10-13 21:41:36 [multiproc_executor.py:671] See https://pytorch.org/docs/stable/library.html#torch.library.opcheck
(Worker_TP1 pid=134175) ERROR 10-13 21:41:36 [multiproc_executor.py:671]
(Worker_TP0 pid=134174) ERROR 10-13 21:41:36 [multiproc_executor.py:671] WorkerProc hit an exception.
(Worker_TP0 pid=134174) ERROR 10-13 21:41:36 [multiproc_executor.py:671] Traceback (most recent call last):
(Worker_TP0 pid=134174) ERROR 10-13 21:41:36 [multiproc_executor.py:671] File "/home/jiwonsong/.conda/envs/moe/lib/python3.12/site-packages/vllm/v1/executor/multiproc_executor.py", line 666, in worker_busy_loop
(Worker_TP0 pid=134174) ERROR 10-13 21:41:36 [multiproc_executor.py:671] output = func(*args, **kwargs)
(Worker_TP0 pid=134174) ERROR 10-13 21:41:36 [multiproc_executor.py:671] ^^^^^^^^^^^^^^^^^^^^^
(Worker_TP0 pid=134174) ERROR 10-13 21:41:36 [multiproc_executor.py:671] File "/home/jiwonsong/.conda/envs/moe/lib/python3.12/site-packages/torch/utils/_contextlib.py", line 120, in decorate_context
(Worker_TP0 pid=134174) ERROR 10-13 21:41:36 [multiproc_executor.py:671] return func(*args, **kwargs)
(Worker_TP0 pid=134174) ERROR 10-13 21:41:36 [multiproc_executor.py:671] ^^^^^^^^^^^^^^^^^^^^^
(Worker_TP0 pid=134174) ERROR 10-13 21:41:36 [multiproc_executor.py:671] File "/home/jiwonsong/.conda/envs/moe/lib/python3.12/site-packages/vllm/v1/worker/gpu_worker.py", line 263, in determine_available_memory
(Worker_TP0 pid=134174) ERROR 10-13 21:41:36 [multiproc_executor.py:671] self.model_runner.profile_run()
(Worker_TP0 pid=134174) ERROR 10-13 21:41:36 [multiproc_executor.py:671] File "/home/jiwonsong/.conda/envs/moe/lib/python3.12/site-packages/vllm/v1/worker/gpu_model_runner.py", line 3392, in profile_run
(Worker_TP0 pid=134174) ERROR 10-13 21:41:36 [multiproc_executor.py:671] = self._dummy_run(self.max_num_tokens, is_profile=True)
(Worker_TP0 pid=134174) ERROR 10-13 21:41:36 [multiproc_executor.py:671] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(Worker_TP0 pid=134174) ERROR 10-13 21:41:36 [multiproc_executor.py:671] File "/home/jiwonsong/.conda/envs/moe/lib/python3.12/site-packages/torch/utils/_contextlib.py", line 120, in decorate_context
(Worker_TP0 pid=134174) ERROR 10-13 21:41:36 [multiproc_executor.py:671] return func(*args, **kwargs)
(Worker_TP0 pid=134174) ERROR 10-13 21:41:36 [multiproc_executor.py:671] ^^^^^^^^^^^^^^^^^^^^^
(Worker_TP0 pid=134174) ERROR 10-13 21:41:36 [multiproc_executor.py:671] File "/home/jiwonsong/.conda/envs/moe/lib/python3.12/site-packages/vllm/v1/worker/gpu_model_runner.py", line 3167, in _dummy_run
(Worker_TP0 pid=134174) ERROR 10-13 21:41:36 [multiproc_executor.py:671] self.drafter.dummy_run(num_tokens)
(Worker_TP0 pid=134174) ERROR 10-13 21:41:36 [multiproc_executor.py:671] File "/home/jiwonsong/.conda/envs/moe/lib/python3.12/site-packages/torch/utils/_contextlib.py", line 120, in decorate_context
(Worker_TP0 pid=134174) ERROR 10-13 21:41:36 [multiproc_executor.py:671] return func(*args, **kwargs)
(Worker_TP0 pid=134174) ERROR 10-13 21:41:36 [multiproc_executor.py:671] ^^^^^^^^^^^^^^^^^^^^^
(Worker_TP0 pid=134174) ERROR 10-13 21:41:36 [multiproc_executor.py:671] File "/home/jiwonsong/.conda/envs/moe/lib/python3.12/site-packages/vllm/v1/spec_decode/eagle.py", line 919, in dummy_run
(Worker_TP0 pid=134174) ERROR 10-13 21:41:36 [multiproc_executor.py:671] self.model(
(Worker_TP0 pid=134174) ERROR 10-13 21:41:36 [multiproc_executor.py:671] File "/home/jiwonsong/.conda/envs/moe/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1773, in _wrapped_call_impl
(Worker_TP0 pid=134174) ERROR 10-13 21:41:36 [multiproc_executor.py:671] return self._call_impl(*args, **kwargs)
(Worker_TP0 pid=134174) ERROR 10-13 21:41:36 [multiproc_executor.py:671] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(Worker_TP0 pid=134174) ERROR 10-13 21:41:36 [multiproc_executor.py:671] File "/home/jiwonsong/.conda/envs/moe/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1784, in _call_impl
(Worker_TP0 pid=134174) ERROR 10-13 21:41:36 [multiproc_executor.py:671] return forward_call(*args, **kwargs)
(Worker_TP0 pid=134174) ERROR 10-13 21:41:36 [multiproc_executor.py:671] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(Worker_TP0 pid=134174) ERROR 10-13 21:41:36 [multiproc_executor.py:671] File "/home/jiwonsong/.conda/envs/moe/lib/python3.12/site-packages/vllm/model_executor/models/llama_eagle3.py", line 241, in forward
(Worker_TP0 pid=134174) ERROR 10-13 21:41:36 [multiproc_executor.py:671] return self.model(input_ids, positions, hidden_states)
(Worker_TP0 pid=134174) ERROR 10-13 21:41:36 [multiproc_executor.py:671] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(Worker_TP0 pid=134174) ERROR 10-13 21:41:36 [multiproc_executor.py:671] File "/home/jiwonsong/.conda/envs/moe/lib/python3.12/site-packages/vllm/compilation/decorators.py", line 310, in __call__
(Worker_TP0 pid=134174) ERROR 10-13 21:41:36 [multiproc_executor.py:671] output = self.compiled_callable(*args, **kwargs)
(Worker_TP0 pid=134174) ERROR 10-13 21:41:36 [multiproc_executor.py:671] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(Worker_TP0 pid=134174) ERROR 10-13 21:41:36 [multiproc_executor.py:671] File "/home/jiwonsong/.conda/envs/moe/lib/python3.12/site-packages/torch/_dynamo/eval_frame.py", line 736, in compile_wrapper
(Worker_TP0 pid=134174) ERROR 10-13 21:41:36 [multiproc_executor.py:671] return fn(*args, **kwargs)
(Worker_TP0 pid=134174) ERROR 10-13 21:41:36 [multiproc_executor.py:671] ^^^^^^^^^^^^^^^^^^^
(Worker_TP0 pid=134174) ERROR 10-13 21:41:36 [multiproc_executor.py:671] File "/home/jiwonsong/.conda/envs/moe/lib/python3.12/site-packages/vllm/model_executor/models/llama_eagle3.py", line 143, in forward
(Worker_TP0 pid=134174) ERROR 10-13 21:41:36 [multiproc_executor.py:671] def forward(
(Worker_TP0 pid=134174) ERROR 10-13 21:41:36 [multiproc_executor.py:671] File "/home/jiwonsong/.conda/envs/moe/lib/python3.12/site-packages/torch/_dynamo/eval_frame.py", line 375, in __call__
(Worker_TP0 pid=134174) ERROR 10-13 21:41:36 [multiproc_executor.py:671] return super().__call__(*args, **kwargs)
(Worker_TP0 pid=134174) ERROR 10-13 21:41:36 [multiproc_executor.py:671] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(Worker_TP0 pid=134174) ERROR 10-13 21:41:36 [multiproc_executor.py:671] File "/home/jiwonsong/.conda/envs/moe/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1773, in _wrapped_call_impl
(Worker_TP0 pid=134174) ERROR 10-13 21:41:36 [multiproc_executor.py:671] return self._call_impl(*args, **kwargs)
(Worker_TP0 pid=134174) ERROR 10-13 21:41:36 [multiproc_executor.py:671] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(Worker_TP0 pid=134174) ERROR 10-13 21:41:36 [multiproc_executor.py:671] File "/home/jiwonsong/.conda/envs/moe/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1784, in _call_impl
(Worker_TP0 pid=134174) ERROR 10-13 21:41:36 [multiproc_executor.py:671] return forward_call(*args, **kwargs)
(Worker_TP0 pid=134174) ERROR 10-13 21:41:36 [multiproc_executor.py:671] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(Worker_TP0 pid=134174) ERROR 10-13 21:41:36 [multiproc_executor.py:671] File "/home/jiwonsong/.conda/envs/moe/lib/python3.12/site-packages/torch/_dynamo/eval_frame.py", line 929, in _fn
(Worker_TP0 pid=134174) ERROR 10-13 21:41:36 [multiproc_executor.py:671] return fn(*args, **kwargs)
(Worker_TP0 pid=134174) ERROR 10-13 21:41:36 [multiproc_executor.py:671] ^^^^^^^^^^^^^^^^^^^
(Worker_TP0 pid=134174) ERROR 10-13 21:41:36 [multiproc_executor.py:671] File "/home/jiwonsong/.conda/envs/moe/lib/python3.12/site-packages/torch/fx/graph_module.py", line 848, in call_wrapped
(Worker_TP0 pid=134174) ERROR 10-13 21:41:36 [multiproc_executor.py:671] return self._wrapped_call(self, *args, **kwargs)
(Worker_TP0 pid=134174) ERROR 10-13 21:41:36 [multiproc_executor.py:671] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(Worker_TP0 pid=134174) ERROR 10-13 21:41:36 [multiproc_executor.py:671] File "/home/jiwonsong/.conda/envs/moe/lib/python3.12/site-packages/torch/fx/graph_module.py", line 424, in __call__
(Worker_TP0 pid=134174) ERROR 10-13 21:41:36 [multiproc_executor.py:671] raise e
(Worker_TP0 pid=134174) ERROR 10-13 21:41:36 [multiproc_executor.py:671] File "/home/jiwonsong/.conda/envs/moe/lib/python3.12/site-packages/torch/fx/graph_module.py", line 411, in __call__
(Worker_TP0 pid=134174) ERROR 10-13 21:41:36 [multiproc_executor.py:671] return super(self.cls, obj).__call__(*args, **kwargs) # type: ignore[misc]
(Worker_TP0 pid=134174) ERROR 10-13 21:41:36 [multiproc_executor.py:671] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(Worker_TP0 pid=134174) ERROR 10-13 21:41:36 [multiproc_executor.py:671] File "/home/jiwonsong/.conda/envs/moe/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1773, in _wrapped_call_impl
(Worker_TP0 pid=134174) ERROR 10-13 21:41:36 [multiproc_executor.py:671] return self._call_impl(*args, **kwargs)
(Worker_TP0 pid=134174) ERROR 10-13 21:41:36 [multiproc_executor.py:671] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(Worker_TP0 pid=134174) ERROR 10-13 21:41:36 [multiproc_executor.py:671] File "/home/jiwonsong/.conda/envs/moe/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1784, in _call_impl
(Worker_TP0 pid=134174) ERROR 10-13 21:41:36 [multiproc_executor.py:671] return forward_call(*args, **kwargs)
(Worker_TP0 pid=134174) ERROR 10-13 21:41:36 [multiproc_executor.py:671] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(Worker_TP0 pid=134174) ERROR 10-13 21:41:36 [multiproc_executor.py:671] File "<eval_with_key>.135", line 18, in forward
(Worker_TP0 pid=134174) ERROR 10-13 21:41:36 [multiproc_executor.py:671] submod_0 = self.submod_0(l_input_ids_, s72, l_self_modules_embed_tokens_parameters_weight_, l_self_modules_layers_modules_0_modules_input_layernorm_parameters_weight_, l_hidden_states_, l_self_modules_layers_modules_0_modules_hidden_norm_parameters_weight_, l_self_modules_layers_modules_0_modules_self_attn_modules_qkv_proj_parameters_weight_, l_positions_, l_self_modules_layers_modules_0_modules_self_attn_modules_rotary_emb_buffers_cos_sin_cache_, s47); l_input_ids_ = l_self_modules_embed_tokens_parameters_weight_ = l_self_modules_layers_modules_0_modules_input_layernorm_parameters_weight_ = l_hidden_states_ =
l_self_modules_layers_modules_0_modules_hidden_norm_parameters_weight_ = l_self_modules_layers_modules_0_modules_self_attn_modules_qkv_proj_parameters_weight_ = l_positions_ = l_self_modules_layers_modules_0_modules_self_attn_modules_rotary_emb_buffers_cos_sin_cache_ = s47 = None (Worker_TP0 pid=134174) ERROR 10-13 21:41:36 [multiproc_executor.py:671] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ (Worker_TP0 pid=134174) ERROR 10-13 21:41:36 [multiproc_executor.py:671] File "/home/jiwonsong/.conda/envs/moe/lib/python3.12/site-packages/vllm/compilation/cuda_graph.py", line 121, in __call__ (Worker_TP0 pid=134174) ERROR 10-13 21:41:36 [multiproc_executor.py:671] return self.runnable(*args, **kwargs) (Worker_TP0 pid=134174) ERROR 10-13 21:41:36 [multiproc_executor.py:671] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ (Worker_TP0 pid=134174) ERROR 10-13 21:41:36 [multiproc_executor.py:671] File "/home/jiwonsong/.conda/envs/moe/lib/python3.12/site-packages/vllm/compilation/cuda_piecewise_backend.py", line 90, in __call__ (Worker_TP0 pid=134174) ERROR 10-13 21:41:36 [multiproc_executor.py:671] return self.compiled_graph_for_general_shape(*args) (Worker_TP0 pid=134174) ERROR 10-13 21:41:36 [multiproc_executor.py:671] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ (Worker_TP0 pid=134174) ERROR 10-13 21:41:36 [multiproc_executor.py:671] File "/home/jiwonsong/.conda/envs/moe/lib/python3.12/site-packages/vllm/compilation/compiler_interface.py", line 518, in compiled_graph (Worker_TP0 pid=134174) ERROR 10-13 21:41:36 [multiproc_executor.py:671] graph_output = inductor_compiled_graph(list_args) 
(Worker_TP0 pid=134174) ERROR 10-13 21:41:36 [multiproc_executor.py:671] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ (Worker_TP0 pid=134174) ERROR 10-13 21:41:36 [multiproc_executor.py:671] File "/home/jiwonsong/.conda/envs/moe/lib/python3.12/site-packages/torch/_inductor/output_code.py", line 584, in __call__ (Worker_TP0 pid=134174) ERROR 10-13 21:41:36 [multiproc_executor.py:671] return self.current_callable(inputs) (Worker_TP0 pid=134174) ERROR 10-13 21:41:36 [multiproc_executor.py:671] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ (Worker_TP0 pid=134174) ERROR 10-13 21:41:36 [multiproc_executor.py:671] File "/home/jiwonsong/.cache/vllm/torch_compile_cache/066401215d/rank_0_0/inductor_cache/4a/c4a7q6n7kicbnzkndzsfambewbtgq5e3lc5265lnnkhzll7oyu4j.py", line 398, in call (Worker_TP0 pid=134174) ERROR 10-13 21:41:36 [multiproc_executor.py:671] assert_size_stride(arg6_1, (3200, 10240), (10240, 1)) (Worker_TP0 pid=134174) ERROR 10-13 21:41:36 [multiproc_executor.py:671] AssertionError: expected size 5120==3200, stride 10240==10240 at dim=0 (Worker_TP0 pid=134174) ERROR 10-13 21:41:36 [multiproc_executor.py:671] This error most often comes from a incorrect fake (aka meta) kernel for a custom op. (Worker_TP0 pid=134174) ERROR 10-13 21:41:36 [multiproc_executor.py:671] Use torch.library.opcheck to test your custom op. 
(Worker_TP0 pid=134174) ERROR 10-13 21:41:36 [multiproc_executor.py:671] See https://pytorch.org/docs/stable/library.html#torch.library.opcheck (Worker_TP0 pid=134174) ERROR 10-13 21:41:36 [multiproc_executor.py:671] Traceback (most recent call last): (Worker_TP0 pid=134174) ERROR 10-13 21:41:36 [multiproc_executor.py:671] File "/home/jiwonsong/.conda/envs/moe/lib/python3.12/site-packages/vllm/v1/executor/multiproc_executor.py", line 666, in worker_busy_loop (Worker_TP0 pid=134174) ERROR 10-13 21:41:36 [multiproc_executor.py:671] output = func(*args, **kwargs) (Worker_TP0 pid=134174) ERROR 10-13 21:41:36 [multiproc_executor.py:671] ^^^^^^^^^^^^^^^^^^^^^ (Worker_TP0 pid=134174) ERROR 10-13 21:41:36 [multiproc_executor.py:671] File "/home/jiwonsong/.conda/envs/moe/lib/python3.12/site-packages/torch/utils/_contextlib.py", line 120, in decorate_context (Worker_TP0 pid=134174) ERROR 10-13 21:41:36 [multiproc_executor.py:671] return func(*args, **kwargs) (Worker_TP0 pid=134174) ERROR 10-13 21:41:36 [multiproc_executor.py:671] ^^^^^^^^^^^^^^^^^^^^^ (Worker_TP0 pid=134174) ERROR 10-13 21:41:36 [multiproc_executor.py:671] File "/home/jiwonsong/.conda/envs/moe/lib/python3.12/site-packages/vllm/v1/worker/gpu_worker.py", line 263, in determine_available_memory (Worker_TP0 pid=134174) ERROR 10-13 21:41:36 [multiproc_executor.py:671] self.model_runner.profile_run() (Worker_TP0 pid=134174) ERROR 10-13 21:41:36 [multiproc_executor.py:671] File "/home/jiwonsong/.conda/envs/moe/lib/python3.12/site-packages/vllm/v1/worker/gpu_model_runner.py", line 3392, in profile_run (Worker_TP0 pid=134174) ERROR 10-13 21:41:36 [multiproc_executor.py:671] = self._dummy_run(self.max_num_tokens, is_profile=True) (Worker_TP0 pid=134174) ERROR 10-13 21:41:36 [multiproc_executor.py:671] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ (Worker_TP0 pid=134174) ERROR 10-13 21:41:36 [multiproc_executor.py:671] File 
"/home/jiwonsong/.conda/envs/moe/lib/python3.12/site-packages/torch/utils/_contextlib.py", line 120, in decorate_context (Worker_TP0 pid=134174) ERROR 10-13 21:41:36 [multiproc_executor.py:671] return func(*args, **kwargs) (Worker_TP0 pid=134174) ERROR 10-13 21:41:36 [multiproc_executor.py:671] ^^^^^^^^^^^^^^^^^^^^^ (Worker_TP0 pid=134174) ERROR 10-13 21:41:36 [multiproc_executor.py:671] File "/home/jiwonsong/.conda/envs/moe/lib/python3.12/site-packages/vllm/v1/worker/gpu_model_runner.py", line 3167, in _dummy_run (Worker_TP0 pid=134174) ERROR 10-13 21:41:36 [multiproc_executor.py:671] self.drafter.dummy_run(num_tokens) (Worker_TP0 pid=134174) ERROR 10-13 21:41:36 [multiproc_executor.py:671] File "/home/jiwonsong/.conda/envs/moe/lib/python3.12/site-packages/torch/utils/_contextlib.py", line 120, in decorate_context (Worker_TP0 pid=134174) ERROR 10-13 21:41:36 [multiproc_executor.py:671] return func(*args, **kwargs) (Worker_TP0 pid=134174) ERROR 10-13 21:41:36 [multiproc_executor.py:671] ^^^^^^^^^^^^^^^^^^^^^ (Worker_TP0 pid=134174) ERROR 10-13 21:41:36 [multiproc_executor.py:671] File "/home/jiwonsong/.conda/envs/moe/lib/python3.12/site-packages/vllm/v1/spec_decode/eagle.py", line 919, in dummy_run (Worker_TP0 pid=134174) ERROR 10-13 21:41:36 [multiproc_executor.py:671] self.model( (Worker_TP0 pid=134174) ERROR 10-13 21:41:36 [multiproc_executor.py:671] File "/home/jiwonsong/.conda/envs/moe/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1773, in _wrapped_call_impl (Worker_TP0 pid=134174) ERROR 10-13 21:41:36 [multiproc_executor.py:671] return self._call_impl(*args, **kwargs) (Worker_TP0 pid=134174) ERROR 10-13 21:41:36 [multiproc_executor.py:671] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ (Worker_TP0 pid=134174) ERROR 10-13 21:41:36 [multiproc_executor.py:671] File "/home/jiwonsong/.conda/envs/moe/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1784, in _call_impl (Worker_TP0 pid=134174) ERROR 10-13 21:41:36 [multiproc_executor.py:671] return 
forward_call(*args, **kwargs) (Worker_TP0 pid=134174) ERROR 10-13 21:41:36 [multiproc_executor.py:671] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ (Worker_TP0 pid=134174) ERROR 10-13 21:41:36 [multiproc_executor.py:671] File "/home/jiwonsong/.conda/envs/moe/lib/python3.12/site-packages/vllm/model_executor/models/llama_eagle3.py", line 241, in forward (Worker_TP0 pid=134174) ERROR 10-13 21:41:36 [multiproc_executor.py:671] return self.model(input_ids, positions, hidden_states) (Worker_TP0 pid=134174) ERROR 10-13 21:41:36 [multiproc_executor.py:671] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ (Worker_TP0 pid=134174) ERROR 10-13 21:41:36 [multiproc_executor.py:671] File "/home/jiwonsong/.conda/envs/moe/lib/python3.12/site-packages/vllm/compilation/decorators.py", line 310, in __call__ (Worker_TP0 pid=134174) ERROR 10-13 21:41:36 [multiproc_executor.py:671] output = self.compiled_callable(*args, **kwargs) (Worker_TP0 pid=134174) ERROR 10-13 21:41:36 [multiproc_executor.py:671] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ (Worker_TP0 pid=134174) ERROR 10-13 21:41:36 [multiproc_executor.py:671] File "/home/jiwonsong/.conda/envs/moe/lib/python3.12/site-packages/torch/_dynamo/eval_frame.py", line 736, in compile_wrapper (Worker_TP0 pid=134174) ERROR 10-13 21:41:36 [multiproc_executor.py:671] return fn(*args, **kwargs) (Worker_TP0 pid=134174) ERROR 10-13 21:41:36 [multiproc_executor.py:671] ^^^^^^^^^^^^^^^^^^^ (Worker_TP0 pid=134174) ERROR 10-13 21:41:36 [multiproc_executor.py:671] File "/home/jiwonsong/.conda/envs/moe/lib/python3.12/site-packages/vllm/model_executor/models/llama_eagle3.py", line 143, in forward (Worker_TP0 pid=134174) ERROR 10-13 21:41:36 [multiproc_executor.py:671] def forward( (Worker_TP0 pid=134174) ERROR 10-13 21:41:36 [multiproc_executor.py:671] File "/home/jiwonsong/.conda/envs/moe/lib/python3.12/site-packages/torch/_dynamo/eval_frame.py", line 375, in __call__ (Worker_TP0 pid=134174) ERROR 10-13 21:41:36 [multiproc_executor.py:671] return super().__call__(*args, 
**kwargs) (Worker_TP0 pid=134174) ERROR 10-13 21:41:36 [multiproc_executor.py:671] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ (Worker_TP0 pid=134174) ERROR 10-13 21:41:36 [multiproc_executor.py:671] File "/home/jiwonsong/.conda/envs/moe/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1773, in _wrapped_call_impl (Worker_TP0 pid=134174) ERROR 10-13 21:41:36 [multiproc_executor.py:671] return self._call_impl(*args, **kwargs) (Worker_TP0 pid=134174) ERROR 10-13 21:41:36 [multiproc_executor.py:671] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ (Worker_TP0 pid=134174) ERROR 10-13 21:41:36 [multiproc_executor.py:671] File "/home/jiwonsong/.conda/envs/moe/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1784, in _call_impl (Worker_TP0 pid=134174) ERROR 10-13 21:41:36 [multiproc_executor.py:671] return forward_call(*args, **kwargs) (Worker_TP0 pid=134174) ERROR 10-13 21:41:36 [multiproc_executor.py:671] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ (Worker_TP0 pid=134174) ERROR 10-13 21:41:36 [multiproc_executor.py:671] File "/home/jiwonsong/.conda/envs/moe/lib/python3.12/site-packages/torch/_dynamo/eval_frame.py", line 929, in _fn (Worker_TP0 pid=134174) ERROR 10-13 21:41:36 [multiproc_executor.py:671] return fn(*args, **kwargs) (Worker_TP0 pid=134174) ERROR 10-13 21:41:36 [multiproc_executor.py:671] ^^^^^^^^^^^^^^^^^^^ (Worker_TP0 pid=134174) ERROR 10-13 21:41:36 [multiproc_executor.py:671] File "/home/jiwonsong/.conda/envs/moe/lib/python3.12/site-packages/torch/fx/graph_module.py", line 848, in call_wrapped (Worker_TP0 pid=134174) ERROR 10-13 21:41:36 [multiproc_executor.py:671] return self._wrapped_call(self, *args, **kwargs) (Worker_TP0 pid=134174) ERROR 10-13 21:41:36 [multiproc_executor.py:671] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ (Worker_TP0 pid=134174) ERROR 10-13 21:41:36 [multiproc_executor.py:671] File "/home/jiwonsong/.conda/envs/moe/lib/python3.12/site-packages/torch/fx/graph_module.py", line 424, in __call__ (Worker_TP0 pid=134174) ERROR 10-13 21:41:36 
[multiproc_executor.py:671] raise e (Worker_TP0 pid=134174) ERROR 10-13 21:41:36 [multiproc_executor.py:671] File "/home/jiwonsong/.conda/envs/moe/lib/python3.12/site-packages/torch/fx/graph_module.py", line 411, in __call__ (Worker_TP0 pid=134174) ERROR 10-13 21:41:36 [multiproc_executor.py:671] return super(self.cls, obj).__call__(*args, **kwargs) # type: ignore[misc] (Worker_TP0 pid=134174) ERROR 10-13 21:41:36 [multiproc_executor.py:671] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ (Worker_TP0 pid=134174) ERROR 10-13 21:41:36 [multiproc_executor.py:671] File "/home/jiwonsong/.conda/envs/moe/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1773, in _wrapped_call_impl (Worker_TP0 pid=134174) ERROR 10-13 21:41:36 [multiproc_executor.py:671] return self._call_impl(*args, **kwargs) (Worker_TP0 pid=134174) ERROR 10-13 21:41:36 [multiproc_executor.py:671] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ (Worker_TP0 pid=134174) ERROR 10-13 21:41:36 [multiproc_executor.py:671] File "/home/jiwonsong/.conda/envs/moe/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1784, in _call_impl (Worker_TP0 pid=134174) ERROR 10-13 21:41:36 [multiproc_executor.py:671] return forward_call(*args, **kwargs) (Worker_TP0 pid=134174) ERROR 10-13 21:41:36 [multiproc_executor.py:671] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ (Worker_TP0 pid=134174) ERROR 10-13 21:41:36 [multiproc_executor.py:671] File "<eval_with_key>.135", line 18, in forward (Worker_TP0 pid=134174) ERROR 10-13 21:41:36 [multiproc_executor.py:671] submod_0 = self.submod_0(l_input_ids_, s72, l_self_modules_embed_tokens_parameters_weight_, l_self_modules_layers_modules_0_modules_input_layernorm_parameters_weight_, l_hidden_states_, l_self_modules_layers_modules_0_modules_hidden_norm_parameters_weight_, l_self_modules_layers_modules_0_modules_self_attn_modules_qkv_proj_parameters_weight_, l_positions_, l_self_modules_layers_modules_0_modules_self_attn_modules_rotary_emb_buffers_cos_sin_cache_, s47); l_input_ids_ = 
l_self_modules_embed_tokens_parameters_weight_ = l_self_modules_layers_modules_0_modules_input_layernorm_parameters_weight_ = l_hidden_states_ = l_self_modules_layers_modules_0_modules_hidden_norm_parameters_weight_ = l_self_modules_layers_modules_0_modules_self_attn_modules_qkv_proj_parameters_weight_ = l_positions_ = l_self_modules_layers_modules_0_modules_self_attn_modules_rotary_emb_buffers_cos_sin_cache_ = s47 = None (Worker_TP0 pid=134174) ERROR 10-13 21:41:36 [multiproc_executor.py:671] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ (Worker_TP0 pid=134174) ERROR 10-13 21:41:36 [multiproc_executor.py:671] File "/home/jiwonsong/.conda/envs/moe/lib/python3.12/site-packages/vllm/compilation/cuda_graph.py", line 121, in __call__ (Worker_TP0 pid=134174) ERROR 10-13 21:41:36 [multiproc_executor.py:671] return self.runnable(*args, **kwargs) (Worker_TP0 pid=134174) ERROR 10-13 21:41:36 [multiproc_executor.py:671] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ (Worker_TP0 pid=134174) ERROR 10-13 21:41:36 [multiproc_executor.py:671] File "/home/jiwonsong/.conda/envs/moe/lib/python3.12/site-packages/vllm/compilation/cuda_piecewise_backend.py", line 90, in __call__ (Worker_TP0 pid=134174) ERROR 10-13 21:41:36 [multiproc_executor.py:671] return self.compiled_graph_for_general_shape(*args) (Worker_TP0 pid=134174) ERROR 10-13 21:41:36 [multiproc_executor.py:671] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ (Worker_TP0 pid=134174) ERROR 10-13 21:41:36 [multiproc_executor.py:671] File "/home/jiwonsong/.conda/envs/moe/lib/python3.12/site-packages/vllm/compilation/compiler_interface.py", line 518, in 
compiled_graph (Worker_TP0 pid=134174) ERROR 10-13 21:41:36 [multiproc_executor.py:671] graph_output = inductor_compiled_graph(list_args) (Worker_TP0 pid=134174) ERROR 10-13 21:41:36 [multiproc_executor.py:671] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ (Worker_TP0 pid=134174) ERROR 10-13 21:41:36 [multiproc_executor.py:671] File "/home/jiwonsong/.conda/envs/moe/lib/python3.12/site-packages/torch/_inductor/output_code.py", line 584, in __call__ (Worker_TP0 pid=134174) ERROR 10-13 21:41:36 [multiproc_executor.py:671] return self.current_callable(inputs) (Worker_TP0 pid=134174) ERROR 10-13 21:41:36 [multiproc_executor.py:671] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ (Worker_TP0 pid=134174) ERROR 10-13 21:41:36 [multiproc_executor.py:671] File "/home/jiwonsong/.cache/vllm/torch_compile_cache/066401215d/rank_0_0/inductor_cache/4a/c4a7q6n7kicbnzkndzsfambewbtgq5e3lc5265lnnkhzll7oyu4j.py", line 398, in call (Worker_TP0 pid=134174) ERROR 10-13 21:41:36 [multiproc_executor.py:671] assert_size_stride(arg6_1, (3200, 10240), (10240, 1)) (Worker_TP0 pid=134174) ERROR 10-13 21:41:36 [multiproc_executor.py:671] AssertionError: expected size 5120==3200, stride 10240==10240 at dim=0 (Worker_TP0 pid=134174) ERROR 10-13 21:41:36 [multiproc_executor.py:671] This error most often comes from a incorrect fake (aka meta) kernel for a custom op. (Worker_TP0 pid=134174) ERROR 10-13 21:41:36 [multiproc_executor.py:671] Use torch.library.opcheck to test your custom op. (Worker_TP0 pid=134174) ERROR 10-13 21:41:36 [multiproc_executor.py:671] See https://pytorch.org/docs/stable/library.html#torch.library.opcheck (Worker_TP0 pid=134174) ERROR 10-13 21:41:36 [multiproc_executor.py:671] ...
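
A possible reading of the two sizes in the assertion, offered as a sketch and not a confirmed diagnosis. Going by the argument order of `submod_0` in the traceback, `arg6_1` appears to be the draft layer's `qkv_proj` weight. The snippet below assumes Qwen3-32B uses `hidden_size=5120`, 64 attention heads, 8 KV heads and an explicit `head_dim=128`; these values are not stated in this issue and should be checked against the model's `config.json`. The helper `qkv_rows_per_rank` is hypothetical and only does the arithmetic. Under those assumptions, the actual 5120 rows match `head_dim=128`, while the 3200 rows the compiled graph asserts match a head size derived as `hidden_size // num_heads = 80`.

```python
def qkv_rows_per_rank(num_heads: int, num_kv_heads: int, head_dim: int, tp_size: int) -> int:
    """Rows of a fused QKV weight held by one tensor-parallel rank (hypothetical helper)."""
    total_rows = (num_heads + 2 * num_kv_heads) * head_dim
    return total_rows // tp_size


# Assumed Qwen3-32B config values (not given in this issue; verify in config.json).
HIDDEN_SIZE = 5120
NUM_HEADS = 64
NUM_KV_HEADS = 8
HEAD_DIM = 128
TP_SIZE = 2  # from `-tp 2` in the launch command

# Rows per rank when the explicit head_dim from the config is used.
rows_explicit = qkv_rows_per_rank(NUM_HEADS, NUM_KV_HEADS, HEAD_DIM, TP_SIZE)

# Rows per rank when head_dim is instead derived as hidden_size // num_heads.
rows_derived = qkv_rows_per_rank(NUM_HEADS, NUM_KV_HEADS, HIDDEN_SIZE // NUM_HEADS, TP_SIZE)

print(rows_explicit, rows_derived)
```

If this reading holds, the weight and the compiled graph disagree on the head size for this model. That would also be consistent with Qwen3-8B loading fine, provided its `head_dim` equals `hidden_size // num_heads`, which is likewise an assumption here.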
