
trtllm-build llama3.1-8b failed #2688

Open
@765500005

Description


```shell
trtllm-build --checkpoint_dir ./tllm_checkpoint_2gpu_tp2 \
    --output_dir ./tmp/llama/7B/trt_engines/fp16/2-gpu/ \
    --context_fmha enable \
    --remove_input_padding enable \
    --gpus_per_node 8 \
    --gemm_plugin auto
```

```text
[TRT] [E] IBuilder::buildSerializedNetwork: Error Code 4: Internal Error (Internal error: plugin node LLaMAForCausalLM/transformer/layers/0/attention/wrapper_L562/gpt_attention_L5483/PLUGIN_V2_GPTAttention_0 requires 210571452800 bytes of scratch space, but only 47697362944 is available. Try increasing the workspace size with IBuilderConfig::setMemoryPoolLimit().)
```

I have 8 GPUs with 46 GB of memory each, but the build still fails with this error. Is the issue the workspace size, and if so, how can I increase it when building with trtllm-build?
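For context, converting the two byte counts in the error message to GiB makes the problem concrete. This is a minimal sketch; `bytes_to_gib` is a hypothetical helper, and the interpretation in the trailing comments is an assumption, not something stated in the issue:

```python
# Sanity check on the numbers in the build error (plain arithmetic, not the
# TensorRT API; the error itself points at IBuilderConfig::setMemoryPoolLimit).

def bytes_to_gib(n: int) -> float:
    """Convert a byte count to GiB (2**30 bytes)."""
    return n / 2**30

required = 210_571_452_800   # scratch space the GPT attention plugin requested
available = 47_697_362_944   # workspace TensorRT had, roughly one 46 GB GPU

print(f"required:  {bytes_to_gib(required):.1f} GiB")   # prints "required:  196.1 GiB"
print(f"available: {bytes_to_gib(available):.1f} GiB")  # prints "available: 44.4 GiB"

# The request far exceeds a single GPU's memory, so raising the workspace
# limit alone cannot satisfy it; shrinking the engine's build-time shape
# limits (e.g. max tokens / max batch size), which drive the plugin's
# scratch requirement, would be the usual lever.
```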

Metadata

Labels

- Investigating
- LLM API/Workflow: High-level LLM Python API & tools (e.g., trtllm-llmapi-launch) for TRTLLM inference/workflows.
- triaged: Issue has been triaged by maintainers
