I am trying to run inference with the VILA 3B model, but I hit this error when I run:

python3 scripts/launch_triton_server.py --world_size 1 --model_repo=multimodal_ifb/ --tensorrt_llm_model_name tensorrt_llm,multimodal_encoders --multimodal_gpu0_cuda_mem_pool_bytes 300000000

My environment:
container: nvcr.io/nvidia/tritonserver:24.11-trtllm-python-py3
transformers 4.43.4
tensorrt_llm 0.15.0