Inference VILA 3b #666

anhnhust · 2024-12-22T04:15:59Z

I am trying to run inference with the VILA 3b model, but I get this error when I run:
python3 scripts/launch_triton_server.py --world_size 1 --model_repo=multimodal_ifb/ --tensorrt_llm_model_name tensorrt_llm,multimodal_encoders --multimodal_gpu0_cuda_mem_pool_bytes 300000000

I am using:
container: nvcr.io/nvidia/tritonserver:24.11-trtllm-python-py3
transformers 4.43.4
tensorrt_llm 0.15.0
