e2e-detection is a toolkit that helps deep learning engineers test their PyTorch/TensorFlow models on the NVIDIA Triton inference server with different inference engines.
Let's introduce the three-stage deployment pipeline. First, deep learning scientists train their models with deep learning frameworks (TensorFlow/PyTorch). Second, the trained models are converted to inference-optimized formats (ONNX/TensorRT/OpenPPL/NCNN/MNN). Finally, the converted models are deployed to the NVIDIA Triton server. We usually call Triton an inference server and the others inference engines, because Triton is responsible for managing resources for multiple models that run on different engines. In Triton, inference engines are also called backends.
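As a concrete illustration of stage two, here is a minimal sketch that exports a PyTorch model to ONNX. The torchvision ResNet-50, file name, and tensor names are placeholders, not part of this repo:

```python
import torch
import torchvision

# Stage two: convert a trained PyTorch model to an inference-optimized format.
# The torchvision ResNet-50 is only a stand-in for your own trained model.
model = torchvision.models.resnet50(pretrained=True).eval()
dummy_input = torch.randn(1, 3, 224, 224)  # example input with the expected shape

torch.onnx.export(
    model,
    dummy_input,
    "resnet50.onnx",          # output file consumed later by TensorRT/Triton
    input_names=["input"],    # tensor names referenced in the Triton config
    output_names=["output"],
    opset_version=11,
)
```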
- The three-stage deployment requires too much engineering effort.
- It is easy to run into dependency errors during deployment.
- A Dockerfile to build all testing environments automatically.
- Two shell scripts to convert and configure trained models automatically.
- A use case of real-world deployment.
As a deep learning engineer, I highly recommend using pre-trained models from SenseTime-MMLab because the team is extremely active in developing advanced deep learning models for diverse tasks in video analytics (e.g., image classification, object detection, semantic segmentation, text detection, 3D object detection, pose estimation, and action-based video understanding).
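For example, loading an MMLab pre-trained detector takes only a few lines with MMDetection; the config and checkpoint paths below are illustrative, picked from the MMDetection model zoo:

```python
from mmdet.apis import init_detector, inference_detector

# Illustrative paths; any config/checkpoint pair from the model zoo works.
config_file = "configs/yolo/yolov3_d53_mstrain-608_273e_coco.py"
checkpoint_file = "checkpoints/yolov3_d53_mstrain-608_273e_coco.pth"

# Build the detector from the zoo config and load the pre-trained weights.
model = init_detector(config_file, checkpoint_file, device="cuda:0")

# Run inference on a single image; results are per-class bounding boxes.
result = inference_detector(model, "demo.jpg")
```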
- Test PyTorch models on the NVIDIA Triton server using the TensorRT inference engine: Faster R-CNN, YOLOv3, DETR, Swin Transformer.
- Test TensorFlow models on the NVIDIA Triton server using the TensorRT inference engine: EfficientDet-Dx.
- A use case of Triton inference server
Inference Engine | Support | Stable | Target Platform |
---|---|---|---|
NVIDIA TensorRT | ✔️ | ✔️ | NVIDIA GPU |
SenseTime-MMLab OpenPPL | | | CPU/GPU/Mobile |
Tencent NCNN | | ✔️ | Mobile CPU |
Alibaba MNN | TBD | ✔️ | Mobile CPU/GPU/NPU |
- Image Classification
- Object Detection
- Semantic Segmentation
- Text Detection
- 3D Object Detection
- Pose Estimation
- Video Understanding
- NVIDIA Triton
  - send/receive HTTP requests (see the client sketch after this list)
  - parse the results
  - run the pipeline in a Docker container
- PyTorch
  - convert PyTorch models (see the TensorRT sketch after this list)
  - test them with the inference engine
  - deploy the optimized model on Triton
- TensorFlow
  - convert TensorFlow models (see the tf2onnx sketch after this list)
  - test them with the inference engine
  - deploy the optimized model on Triton
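A minimal sketch of the Triton HTTP client workflow (send a request, parse the result). The model name, tensor names, and input shape are assumptions; they must match the deployed model's `config.pbtxt`:

```python
import numpy as np
import tritonclient.http as httpclient

# Connect to a Triton server running locally on the default HTTP port.
client = httpclient.InferenceServerClient(url="localhost:8000")

# Model name, tensor names, and shape are placeholders; they must match
# the deployed model's config.pbtxt.
image = np.random.rand(1, 3, 608, 608).astype(np.float32)
infer_input = httpclient.InferInput("input", list(image.shape), "FP32")
infer_input.set_data_from_numpy(image)

response = client.infer(
    model_name="yolov3",
    inputs=[infer_input],
    outputs=[httpclient.InferRequestedOutput("output")],
)

# Parse the raw response back into a numpy array.
detections = response.as_numpy("output")
print(detections.shape)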
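```

For the PyTorch path, a sketch that builds a TensorRT engine from the exported ONNX file, assuming the TensorRT 8.x Python API (the `trtexec` CLI or this repo's shell scripts achieve the same thing):

```python
import tensorrt as trt

logger = trt.Logger(trt.Logger.WARNING)
builder = trt.Builder(logger)

# An explicit-batch network is required for ONNX parsing.
network = builder.create_network(
    1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH)
)
parser = trt.OnnxParser(network, logger)

with open("resnet50.onnx", "rb") as f:  # file name from the export sketch above
    if not parser.parse(f.read()):
        for i in range(parser.num_errors):
            print(parser.get_error(i))
        raise RuntimeError("Failed to parse the ONNX model")

config = builder.create_builder_config()
config.set_memory_pool_limit(trt.MemoryPoolType.WORKSPACE, 1 << 30)  # 1 GiB workspace

# Serialize the engine; Triton's TensorRT backend loads it as model.plan.
engine_bytes = builder.build_serialized_network(network, config)
with open("model.plan", "wb") as f:
    f.write(engine_bytes)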
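```

Likewise for TensorFlow, a sketch using tf2onnx to convert a Keras model to ONNX; the EfficientNet stand-in and input signature are assumptions:

```python
import tensorflow as tf
import tf2onnx

# Stand-in model; in practice this would be your trained detector (e.g., EfficientDet).
model = tf.keras.applications.EfficientNetB0(weights="imagenet")

# The input signature must match the model's expected input shape and dtype.
spec = (tf.TensorSpec((None, 224, 224, 3), tf.float32, name="input"),)

# Convert and write the ONNX file that the downstream inference engine consumes.
model_proto, _ = tf2onnx.convert.from_keras(
    model, input_signature=spec, opset=13, output_path="model.onnx"
)
```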