Description
Is your feature request related to a problem? Please describe.
The -nvidia image includes onnxruntime-gpu 1.19.2, which reports TensorrtExecutionProvider as available, but the provider can't actually load because libnvinfer.so.10 isn't in the image:

Failed to load library libonnxruntime_providers_tensorrt.so with error: libnvinfer.so.10: cannot open shared object file
Falling back to ['CUDAExecutionProvider', 'CPUExecutionProvider'] and retrying.
All ONNX inference (MusiCNN, CLAP, MuQ-MuLan) falls back to the CUDA provider. TensorRT could give a boost to processing speed.
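To make the fallback concrete, here is an illustrative sketch (not AudioMuse-AI's actual code) of the provider-selection behavior: onnxruntime reports TensorRT as available, but when its shared libraries can't be resolved it silently drops down the preference list. The `select_providers` helper and its `loadable` parameter are hypothetical, just to model what the log above shows.

```python
# Illustrative sketch of ONNX Runtime's provider fallback; the helper
# below is hypothetical and only models the behavior seen in the log.

PREFERRED = ["TensorrtExecutionProvider", "CUDAExecutionProvider", "CPUExecutionProvider"]

def select_providers(available, loadable):
    """Return the provider list a session ends up using.

    `available` models what ort.get_available_providers() reports;
    `loadable` models the subset whose shared libraries actually resolve
    (TensorRT drops out when libnvinfer.so.10 is missing).
    """
    chosen = [p for p in PREFERRED if p in available and p in loadable]
    # CPU is always kept as a last resort.
    if "CPUExecutionProvider" not in chosen:
        chosen.append("CPUExecutionProvider")
    return chosen

# Mirrors the log above: TensorRT is reported as available, but
# libnvinfer.so.10 is missing, so it is skipped.
print(select_providers(
    available=["TensorrtExecutionProvider", "CUDAExecutionProvider", "CPUExecutionProvider"],
    loadable=["CUDAExecutionProvider", "CPUExecutionProvider"],
))  # ['CUDAExecutionProvider', 'CPUExecutionProvider']
```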
Describe the solution you'd like
Include the TensorRT runtime libraries, libnvinfer10 and libnvinfer-plugin10, in the -nvidia Dockerfile so onnxruntime-gpu can use TensorrtExecutionProvider without any extra setup. Possibly gate it behind a USE_TENSORRT env var.
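A minimal sketch of what the Dockerfile change could look like, assuming the NVIDIA CUDA apt repository is already configured in the -nvidia base image (the package names are the ones mentioned above; the `USE_TENSORRT` variable is a hypothetical opt-in switch the application would have to check):

```dockerfile
# Sketch only: assumes the NVIDIA CUDA apt repo is already set up
# in the -nvidia base image.
RUN apt-get update && \
    apt-get install -y --no-install-recommends \
        libnvinfer10 \
        libnvinfer-plugin10 && \
    rm -rf /var/lib/apt/lists/*

# Hypothetical opt-in switch; the app would check this before
# requesting TensorrtExecutionProvider.
ENV USE_TENSORRT=0
```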
Describe alternatives you've considered
I thought about manually mounting TensorRT libs into the container...
Additional context
- Image: ghcr.io/neptunehub/audiomuse-ai:latest-nvidia
- GPU: NVIDIA RTX 4080 SUPER (16GB VRAM)
- onnxruntime-gpu: 1.19.2
- Available providers: ['TensorrtExecutionProvider', 'CUDAExecutionProvider', 'CPUExecutionProvider']
- Active providers (after fallback): ['CUDAExecutionProvider', 'CPUExecutionProvider']
- Maybe TensorRT gives lower-quality results, isn't worth the extra image size, or is too annoying to support, in which case... never mind! I'm just exploring ideas for potential analysis speed boosts. I definitely prefer analysis quality over performance, though.