Update stable diffusion benchmark for TensorRT EP (microsoft#16560)
### Description

Add Stable Diffusion Text2Image pipelines for the TensorRT EP and the CUDA EP. The pipelines automatically export and optimize the ONNX model, then create an ONNX Runtime session that uses the TensorRT or CUDA execution provider.

Add support for benchmarking TensorRT.

Add support for CUDA graph. This feature is currently supported only in the nightly package.

Engine/Provider to test | command line
---- | ---
CUDA EP | `python benchmark.py -v 1.5`
CUDA EP with CUDA graph | `python benchmark.py -v 1.5 --enable_cuda_graph`
TensorRT EP | `python benchmark.py -v 1.5 -r tensorrt`
TensorRT EP with CUDA graph | `python benchmark.py -v 1.5 -r tensorrt --enable_cuda_graph`
TensorRT | `python benchmark.py -v 1.5 -e tensorrt`

Add benchmark numbers for the T4 GPU using CUDA 11.7, cuDNN 8.5, PyTorch 1.13.1+cu11.7, TensorRT 8.6.1, and onnxruntime-gpu 1.15.1 (or ort-nightly-gpu 1.16 for CUDA graph).

TODO: add benchmark numbers for A100-80GB.
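
To run every configuration from the command table in one pass, the rows can be generated programmatically. The following is a minimal sketch, not part of this commit: the `CONFIGS` mapping and the helpers `benchmark_command` and `all_commands` are hypothetical names introduced here, and the flags are copied verbatim from the table. It only builds the command strings and does not invoke `benchmark.py`.

```python
import shlex

# Flag combinations copied from the command table in the description.
# The dictionary and helper names are hypothetical, for illustration only.
CONFIGS = {
    "CUDA EP": [],
    "CUDA EP with CUDA graph": ["--enable_cuda_graph"],
    "TensorRT EP": ["-r", "tensorrt"],
    "TensorRT EP with CUDA graph": ["-r", "tensorrt", "--enable_cuda_graph"],
    "TensorRT": ["-e", "tensorrt"],
}


def benchmark_command(config, version="1.5"):
    """Return the argv list for one row of the table."""
    return ["python", "benchmark.py", "-v", version, *CONFIGS[config]]


def all_commands(version="1.5"):
    """Map each configuration name to its shell-quoted command line."""
    return {
        name: shlex.join(benchmark_command(name, version))
        for name in CONFIGS
    }


if __name__ == "__main__":
    for name, cmd in all_commands().items():
        print(f"{name}: {cmd}")
```

Each argv list can be passed to `subprocess.run` from the directory that contains `benchmark.py`. The CUDA graph rows assume the nightly package is installed, as noted in the description.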