
[FEATURE] Add TensorRT support to the nvidia image? #353

@jaredtrent

Description

Is your feature request related to a problem? Please describe.

The -nvidia image includes onnxruntime-gpu 1.19.2, which reports TensorrtExecutionProvider as available, but the provider can't actually load because libnvinfer.so.10 isn't present in the image:

    Failed to load library libonnxruntime_providers_tensorrt.so with error: libnvinfer.so.10: cannot open shared object file
    Falling back to ['CUDAExecutionProvider', 'CPUExecutionProvider'] and retrying.

All ONNX inference (MusiCNN, CLAP, MuQ-MuLan) falls back to the CUDA provider. TensorRT could give a boost to processing speed.
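For reference, the fallback above is onnxruntime's standard behavior when a provider in the requested list fails to load. A minimal sketch of making the provider order explicit and gating TensorRT behind an env var (the `select_providers` helper and the `USE_TENSORRT` variable are illustrative assumptions, not existing project code):

```python
import os

def select_providers(available, use_tensorrt):
    """Build an ordered provider list: TensorRT (if enabled and present),
    then CUDA, with CPU always kept as the final fallback."""
    providers = []
    if use_tensorrt and "TensorrtExecutionProvider" in available:
        providers.append("TensorrtExecutionProvider")
    if "CUDAExecutionProvider" in available:
        providers.append("CUDAExecutionProvider")
    providers.append("CPUExecutionProvider")  # always reachable fallback
    return providers

# The list would then be handed to onnxruntime, e.g.:
#   import onnxruntime as ort
#   sess = ort.InferenceSession(
#       "model.onnx",
#       providers=select_providers(
#           ort.get_available_providers(),
#           os.environ.get("USE_TENSORRT", "0") == "1"))
```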

Describe the solution you'd like

Include the TensorRT runtime libraries, libnvinfer10 and libnvinfer-plugin10, in the -nvidia Dockerfile so onnxruntime-gpu can use TensorrtExecutionProvider without any extra setup. A USE_TENSORRT env var could optionally be added to let users opt in or out.
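A sketch of what the Dockerfile change might look like, assuming the -nvidia image is based on an Ubuntu/CUDA image with NVIDIA's apt repository configured (the TensorRT 10 runtime package names below follow NVIDIA's Debian packaging; exact availability depends on the base image):

```dockerfile
# Install only the TensorRT runtime libs that onnxruntime-gpu's
# TensorrtExecutionProvider dlopens (libnvinfer.so.10 and its plugin lib),
# not the full TensorRT SDK.
RUN apt-get update && \
    apt-get install -y --no-install-recommends \
        libnvinfer10 \
        libnvinfer-plugin10 && \
    rm -rf /var/lib/apt/lists/*
```

Pulling in only the runtime libraries (rather than the dev packages) keeps the image-size cost to the shared objects the provider actually needs.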

Describe alternatives you've considered

I thought about manually mounting TensorRT libs into the container...

Additional context

  • Image: ghcr.io/neptunehub/audiomuse-ai:latest-nvidia
  • GPU: NVIDIA RTX 4080 SUPER (16GB VRAM)
  • onnxruntime-gpu: 1.19.2
  • Available providers: ['TensorrtExecutionProvider', 'CUDAExecutionProvider', 'CPUExecutionProvider']
  • Active providers (after fallback): ['CUDAExecutionProvider', 'CPUExecutionProvider']
  • Maybe TensorRT is lower quality, or not worth the extra space, or too annoying to work with, in which case... nevermind!! I'm just exploring ideas for potential analysis speed boosts. I definitely prefer analysis quality over performance though.

Metadata

Assignees

No one assigned

    Labels

    enhancement (New feature or request)
