This page provides details about mmdet2trt.
shape_ranges is used to set the min/opt/max shape of the input tensor. For each dimension, min <= opt <= max. For example:
shape_ranges=dict(
    x=dict(
        min=[1,3,320,320],
        opt=[1,3,800,1344],
        max=[1,3,1344,1344],
    )
)
trt_model = mmdet2trt( ...,
                       shape_ranges=shape_ranges, # set the opt shape
                       ...)
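This config accepts input sizes from (320, 320) to (1344, 1344). The batch size is fixed at 1 because min and max both use 1 for the batch dimension; raise the first element of max (for example, to 4) to allow larger batches.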
Warning
Dynamic input shapes and dynamic batch sizes might need more memory. Use a fixed shape (min = opt = max) to avoid unnecessary memory usage.
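As a quick sanity check before conversion, the constraint min <= opt <= max and the fixed-shape case can be sketched in plain Python. The helper names check_shape_range and is_fixed_shape are hypothetical and are not part of mmdet2trt; the fixed shape (800, 1344) is only an illustration.

```python
def check_shape_range(shape_range):
    """Return True if min <= opt <= max holds for every dimension."""
    lo, opt, hi = shape_range["min"], shape_range["opt"], shape_range["max"]
    if not (len(lo) == len(opt) == len(hi)):
        return False
    return all(a <= b <= c for a, b, c in zip(lo, opt, hi))


def is_fixed_shape(shape_range):
    """Return True if min, opt and max are identical (no dynamic shape)."""
    return shape_range["min"] == shape_range["opt"] == shape_range["max"]


# Dynamic range taken from the example above.
dynamic = dict(min=[1, 3, 320, 320], opt=[1, 3, 800, 1344], max=[1, 3, 1344, 1344])

# Fixed shape: min = opt = max, which avoids the extra memory of dynamic shapes.
fixed = dict(min=[1, 3, 800, 1344], opt=[1, 3, 800, 1344], max=[1, 3, 800, 1344])

# This dict has the same layout as the shape_ranges argument shown above.
shape_ranges = dict(x=fixed)
```

Running the check on a range before passing it to the converter may catch a swapped min/opt pair early, before an engine build is attempted.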
fp16 mode can accelerate inference. Set fp16_mode=True to enable it.
trt_model = mmdet2trt( ...,
fp16_mode=True, # enable fp16 mode
...)
To use int8 mode:
- Set int8_mode=True.
- Provide a calibration dataset. The __getitem__() method of the dataset should return a list of tensors of shape (C,H,W), and that shape must be the same as shape_ranges['x']['opt'][1:] (the optimize shape). The tensors should go through the same preprocessing as the model input. There is a default dataset; you can also provide a custom one.
- Set the calibration algorithm. entropy and minmax are supported.
from mmdet2trt import mmdet2trt, Int8CalibDataset
cfg_path="..." # MMDetection config path
model_path="..." # MMDetection checkpoint path
image_path_list = [...] # lists of image paths
shape_ranges=dict(
    x=dict(
        min=[...],
        opt=[...],
        max=[...],
    )
)
calib_dataset = Int8CalibDataset(image_path_list, cfg_path, shape_ranges)
trt_model = mmdet2trt(cfg_path, model_path,
                      shape_ranges=shape_ranges,
                      int8_mode=True,
                      int8_calib_dataset=calib_dataset,
                      int8_calib_alg="entropy")
Warning
Not all models support int8 mode.
Some layers need extra GPU memory, and some optimization tactics also need more space. Enlarging max_workspace_size may accelerate your model at the cost of more memory.
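A minimal sketch of choosing a workspace size, assuming max_workspace_size is given in bytes and passed to mmdet2trt as a keyword argument. The helper gib is hypothetical, and the call is left as a comment because it needs a real config and checkpoint.

```python
def gib(n):
    """Convert a size in GiB to bytes."""
    return int(n * (1 << 30))


# 1 GiB of workspace; a larger value lets TensorRT consider more tactics.
max_workspace_size = gib(1)

# Assumed usage (not executed here):
# trt_model = mmdet2trt(cfg_path, model_path,
#                       max_workspace_size=max_workspace_size)
```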
The converted model is a Python wrapper around a TensorRT engine. First, get the serialized engine from trt_model:
with open(engine_path, mode='wb') as f:
    f.write(trt_model.state_dict()['engine'])
Link ${AMIRSTAN_PLUGIN_DIR}/build/lib/libamirstan_plugin.so in your project (or load it at runtime). Then compile and load the engine.
Warning
You might need to invoke initLibAmirstanInferPlugins() from amirInferPlugin.h to load the plugins.
The engine only contains the inference forward pass. Preprocessing (resize, normalize) and postprocessing (dividing the boxes by the scale factor) must be done in your project.
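The two steps that stay outside the engine can be sketched in plain Python as follows. The function names are hypothetical, and the mean/std values are an assumption taken from typical MMDetection configs; use the values from your own model config. Resizing is not shown.

```python
# Assumed normalization constants; read the real ones from your model config.
mean = [123.675, 116.28, 103.53]
std = [58.395, 57.12, 57.375]


def normalize_pixel(pixel, mean, std):
    """Normalize one (R, G, B) pixel: (value - mean) / std per channel."""
    return [(v - m) / s for v, m, s in zip(pixel, mean, std)]


def rescale_boxes(boxes, scale_factor):
    """Map boxes from the resized input back to the original image.

    boxes: list of [x1, y1, x2, y2] in resized-image coordinates.
    scale_factor: resized_size / original_size used during preprocessing.
    """
    return [[coord / scale_factor for coord in box] for box in boxes]


# Example: a 1920-pixel-wide image resized so that its width becomes 1344.
scale_factor = 1344 / 1920
boxes_resized = [[70.0, 140.0, 700.0, 560.0]]
boxes_original = rescale_boxes(boxes_resized, scale_factor)
```

In a real project the normalization would be applied to the whole image tensor at once; the per-pixel form here only makes the arithmetic explicit.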
When converting the model, set the output names:
trt_model = mmdet2trt( ...,
output_names=["num_detections", "boxes", "scores", "classes"], # output names
...)
Create the engine file:
with open(engine_path, mode='wb') as f:
    f.write(trt_model.state_dict()['engine'])
In the DeepStream model config file, set the following properties:
[property]
...
net-scale-factor=0.0173 # compute from mean, std
offsets=123.675;116.28;103.53 # compute from mean, std
model-engine-file=trt.engine # the engine file created by mmdet2trt
labelfile-path=labels.txt # label file
...
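The two values marked "compute from mean, std" can be reproduced with a short calculation. This is a sketch under two assumptions: the std values below come from typical MMDetection configs (this page does not state them), and nvinfer normalizes as y = net-scale-factor * (x - offset) with a single scalar scale, so the per-channel std is approximated by its average.

```python
# Assumed normalization from a typical MMDetection config; check your own.
mean = [123.675, 116.28, 103.53]
std = [58.395, 57.12, 57.375]

# One scalar scale for all channels: the reciprocal of the average std.
net_scale_factor = 1.0 / (sum(std) / len(std))

# Offsets are the per-channel means, joined with ';' as in the config file.
offsets = ";".join(str(m) for m in mean)

print(f"net-scale-factor={net_scale_factor:.6f}")
print(f"offsets={offsets}")
```

With these inputs the scale comes out at roughly 0.01735, which the config above writes as 0.0173.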
In the same config file, set the plugin library and the parse function:
[property]
...
parse-bbox-func-name=NvDsInferParseMmdet # parse function name (built into the amirstan plugin)
output-bbox-name=boxes # output name of the bounding box
output-blob-names=num_detections;boxes;scores;classes # output blob names, same as convert output_names
custom-lib-path=libamirstan_plugin.so # amirstan plugin lib path
...
You might also need to set group-threshold=0, because nvdsinfer tries to cluster the detected objects generated by the parse function. Read "A problem about parse-bbox-func-name" for more detail. (Thanks @Paweł Pęczek for providing the technical details.)
[class-attrs-all]
...
group-threshold=0
...
Enjoy the model in DeepStream.
Warning: I am not so familiar with DeepStream. If you find anything wrong above, please let me know.
Set the enable_mask flag to True:
# enable mask
trt_model = mmdet2trt(..., enable_mask=True)
Note
The mask output has shape [batch_size, num_boxes, 28, 28]. Mask post-processing is not included in the model; implement it yourself if you want to integrate the converted engine into your own project.
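One possible shape for that post-processing step is sketched below in plain Python: each low-resolution mask is stretched over its box and thresholded into a full-size binary mask. The function paste_mask and the 0.5 threshold are assumptions, and nearest-neighbour lookup is used for brevity, so results may differ from the interpolation used in MMDetection itself.

```python
def paste_mask(mask, box, img_h, img_w, threshold=0.5):
    """Paste one low-resolution mask into a full-size binary image mask.

    mask: 2D list (for example 28x28) of probabilities for a single box.
    box: [x1, y1, x2, y2] in image coordinates.
    """
    mask_h, mask_w = len(mask), len(mask[0])
    x1, y1, x2, y2 = box
    box_w = max(x2 - x1, 1e-6)
    box_h = max(y2 - y1, 1e-6)
    out = [[0] * img_w for _ in range(img_h)]
    # Only visit pixels that can fall inside the box, clipped to the image.
    for y in range(max(int(y1), 0), min(int(y2) + 1, img_h)):
        for x in range(max(int(x1), 0), min(int(x2) + 1, img_w)):
            # Relative position of the pixel center inside the box.
            u = (x + 0.5 - x1) / box_w
            v = (y + 0.5 - y1) / box_h
            if 0 <= u < 1 and 0 <= v < 1:
                # Nearest-neighbour lookup in the low-resolution mask.
                mx = int(u * mask_w)
                my = int(v * mask_h)
                out[y][x] = 1 if mask[my][mx] >= threshold else 0
    return out


# Tiny example: a 2x2 mask pasted into a 4x4 image covering the whole frame.
tiny_mask = [[0.9, 0.1], [0.1, 0.9]]
full_mask = paste_mask(tiny_mask, [0, 0, 4, 4], img_h=4, img_w=4)
```

For a real engine output, the function would be called once per detected box with the matching 28x28 slice, after the boxes have been mapped back to the original image size.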