Description
While testing the Qwen2.5-VL-3B-Instruct model with TensorRT-LLM (TRTLLM) 0.19.0, following the PyTorch workflow example at
https://github.com/NVIDIA/TensorRT-LLM/blob/release/0.19/examples/pytorch/quickstart_multimodal.py
I found that running with the --use_cuda_graph flag produces only a few generated tokens (a single 'A' in the first run below). Removing --use_cuda_graph, with all other configuration parameters unchanged, produces normal output with over 100 tokens. Re-running the same test on TRTLLM 0.20.0 gave the same results.
python3 quickstart_multimodal.py \
    --model_dir /qwen/tmp/hf_models/Qwen2.5-VL-3B-Instruct \
    --modality image \
    --max_batch_size 1 \
    --max_num_tokens 4096 \
    --attention_backend TRTLLM \
    --prompt "Please describe this image in at least 100 characters, covering main elements and their relationships." \
    --media "/qwen/pics/demo.jpeg" \
    --max_tokens 128 \
    --use_cuda_graph \
    2>&1 | tee run_qwen2.5_vl_3B_cuda_graph.log
[0] Prompt: '<|im_start|>system\nYou are a helpful assistant.<|im_end|>\n<|im_start|>user\n<|vision_start|><|image_pad|><|vision_end|>Please describe this image in at least 100 characters, covering main elements and their relationships.<|im_end|>\n<|im_start|>assistant\n', Generated text: 'A'
python3 quickstart_multimodal.py \
    --model_dir /qwen/tmp/hf_models/Qwen2.5-VL-3B-Instruct \
    --modality image \
    --max_batch_size 1 \
    --max_num_tokens 4096 \
    --attention_backend TRTLLM \
    --prompt "Please describe this image in at least 100 characters, covering main elements and their relationships." \
    --media "/qwen/pics/demo.jpeg" \
    --max_tokens 128 \
    2>&1 | tee run_qwen2.5_vl_3B_without_cuda_graph.log
[0] Prompt: '<|im_start|>system\nYou are a helpful assistant.<|im_end|>\n<|im_start|>user\n<|vision_start|><|image_pad|><|vision_end|>Please describe this image in at least 100 characters, covering main elements and their relationships.<|im_end|>\n<|im_start|>assistant\n', Generated text: 'A woman and her golden retriever dog are sitting on a sandy beach, with the ocean in the background. The woman is wearing a plaid shirt and black pants, while the dog is wearing a harness. They are both smiling and interacting with each other.'
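To make the difference between the two runs easy to check, here is a minimal sketch that pulls the `Generated text:` field out of each log and reports its length. It assumes the log lines keep the `[N] Prompt: ..., Generated text: '...'` shape shown above; the helper names `extract_generated` and `summarize` are hypothetical and not part of TensorRT-LLM. It counts characters and whitespace-separated words, not tokenizer tokens, so it is only a rough proxy for output length.

```python
import re
from pathlib import Path

# Matches the trailing "Generated text: '...'" field of a quickstart log line.
# The text is printed as a Python repr, so it may use either quote style.
GENERATED_RE = re.compile(r"Generated text: (['\"])(.*)\1\s*$")


def extract_generated(log_text):
    """Return every generated string found in the log, in order."""
    found = []
    for line in log_text.splitlines():
        match = GENERATED_RE.search(line)
        if match:
            found.append(match.group(2))
    return found


def summarize(log_text):
    """Return (character count, word count) for each generated output."""
    return [(len(text), len(text.split()))
            for text in extract_generated(log_text)]


if __name__ == "__main__":
    # Log names taken from the two tee targets in the commands above.
    for name in ("run_qwen2.5_vl_3B_cuda_graph.log",
                 "run_qwen2.5_vl_3B_without_cuda_graph.log"):
        path = Path(name)
        if not path.exists():
            print(f"{name}: not found, skipping")
            continue
        for chars, words in summarize(path.read_text(errors="replace")):
            print(f"{name}: {chars} characters, {words} words")
```

Run from the directory containing both logs, this prints one line per generated output, which makes a truncated CUDA graph run stand out next to the run without it.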