Update yolov8n tutorial with 16b manual configurations (#1156)
Idan-BenAmi authored Aug 15, 2024
1 parent 73f5fab commit 67ec854
Showing 2 changed files with 87 additions and 54 deletions.
8 changes: 4 additions & 4 deletions tutorials/mct_model_garden/models_pytorch/yolov8/yolov8.py
@@ -266,10 +266,10 @@ def forward(self, x: Tensor) -> Tuple[Tensor, Tensor]:

  # box decoding
  lt, rb = dfl.chunk(2, 1)
- y1 = self.relu1(self.anchors.unsqueeze(0)[:, 0, :] - lt[:, 0, :])
- x1 = self.relu2(self.anchors.unsqueeze(0)[:, 1, :] - lt[:, 1, :])
- y2 = self.relu3(self.anchors.unsqueeze(0)[:, 0, :] + rb[:, 0, :])
- x2 = self.relu4(self.anchors.unsqueeze(0)[:, 1, :] + rb[:, 1, :])
+ y1 = self.anchors.unsqueeze(0)[:, 0, :] - lt[:, 0, :]
+ x1 = self.anchors.unsqueeze(0)[:, 1, :] - lt[:, 1, :]
+ y2 = self.anchors.unsqueeze(0)[:, 0, :] + rb[:, 0, :]
+ x2 = self.anchors.unsqueeze(0)[:, 1, :] + rb[:, 1, :]
  y_bb = torch.stack((x1, y1, x2, y2), 1).transpose(1, 2)
  return y_bb, y_cls
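
The box decoding in this hunk is plain arithmetic on anchor points and predicted distances. Below is a minimal scalar sketch for a single anchor, not the repository's code: `decode_box` and `decode_box_clamped` are hypothetical helper names, the (y, x) index order simply follows the variable names used in the diff, and the clamped variant assumes `relu1` to `relu4` are standard ReLU modules.

```python
from typing import Tuple

Box = Tuple[float, float, float, float]


def decode_box(anchor_yx: Tuple[float, float],
               lt: Tuple[float, float],
               rb: Tuple[float, float]) -> Box:
    """Decode one box from an anchor point and two distance pairs.

    anchor_yx: anchor point, index 0 named y and index 1 named x as in the diff.
    lt: distances subtracted from the anchor (first corner).
    rb: distances added to the anchor (second corner).
    Returns (x1, y1, x2, y2), the order passed to torch.stack in the diff.
    """
    y1 = anchor_yx[0] - lt[0]
    x1 = anchor_yx[1] - lt[1]
    y2 = anchor_yx[0] + rb[0]
    x2 = anchor_yx[1] + rb[1]
    return (x1, y1, x2, y2)


def decode_box_clamped(anchor_yx: Tuple[float, float],
                       lt: Tuple[float, float],
                       rb: Tuple[float, float]) -> Box:
    """Same decoding with every coordinate clamped at zero (assumed ReLU behaviour)."""
    x1, y1, x2, y2 = decode_box(anchor_yx, lt, rb)
    return (max(0.0, x1), max(0.0, y1), max(0.0, x2), max(0.0, y2))


if __name__ == "__main__":
    anchor = (2.0, 3.0)   # illustrative values only
    lt = (4.0, 1.0)
    rb = (1.5, 2.5)
    print("unclamped:", decode_box(anchor, lt, rb))
    print("clamped:  ", decode_box_clamped(anchor, lt, rb))
```

With these illustrative numbers the unclamped `y1` is negative, while the clamped variant pins it to zero. That is the only difference between the two forms.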

@@ -67,39 +67,35 @@
  {
  "cell_type": "code",
  "execution_count": null,
- "id": "9728247bc20d0600",
- "metadata": {
- "collapsed": false
- },
  "outputs": [],
  "source": [
  "import sys\n",
  "import os\n",
  "import importlib\n",
  "\n",
  "if not importlib.util.find_spec('model_compression_toolkit'):\n",
- " !pip install model_compression_toolkit\n",
+ " !pip install mct-nightly\n",
  "!git clone https://github.com/sony/model_optimization.git temp_mct && mv temp_mct/tutorials . && \\rm -rf temp_mct\n",
  "sys.path.insert(0,\"tutorials\")"
- ]
+ ],
+ "metadata": {
+ "collapsed": false
+ },
+ "id": "b6178c86a2df086"
  },
  {
  "cell_type": "markdown",
- "id": "7a1038b9fd98bba2",
+ "source": [
+ "### Download COCO evaluation set"
+ ],
  "metadata": {
  "collapsed": false
  },
- "source": [
- "### Download COCO evaluation set"
- ]
+ "id": "2addc3f2e6fbf402"
  },
  {
  "cell_type": "code",
  "execution_count": null,
- "id": "8bea492d71b4060f",
- "metadata": {
- "collapsed": false
- },
  "outputs": [],
  "source": [
  "if not os.path.isdir('coco'):\n",
@@ -109,53 +105,61 @@
  " !wget -nc http://images.cocodataset.org/zips/val2017.zip\n",
  " !unzip -q -o val2017.zip -d ./coco\n",
  " !echo Done loading val2017 images"
- ]
+ ],
+ "metadata": {
+ "collapsed": false
+ },
+ "id": "4555a00ab957c2eb"
  },
  {
  "cell_type": "markdown",
- "id": "084c2b8b-3175-4d46-a18a-7c4d8b6fcb38",
- "metadata": {},
  "source": [
  "## Model Quantization\n",
  "\n",
  "### Download a Pre-Trained Model \n",
  "\n",
  "We begin by loading a pre-trained [YOLOv8n](https://huggingface.co/SSI-DNN/pytorch_yolov8n_640x640_bb_decoding) model. This implementation is based on [Ultralytics](https://github.com/ultralytics/ultralytics) and includes a slightly modified version of yolov8 detection-head (mainly the box decoding part) that was adapted for model quantization. For further insights into the model's implementation details, please refer to [MCT Models Garden - yolov8](https://github.com/sony/model_optimization/tree/main/tutorials/mct_model_garden/models_pytorch/yolov8). "
- ]
+ ],
+ "metadata": {
+ "collapsed": false
+ },
+ "id": "6a97125e471fae9"
  },
  {
  "cell_type": "code",
  "execution_count": null,
- "id": "e8395b28-4732-4d18-b081-5d3bdf508691",
- "metadata": {},
  "outputs": [],
  "source": [
  "from tutorials.mct_model_garden.models_pytorch.yolov8.yolov8 import ModelPyTorch, yaml_load, model_predict\n",
  "cfg_dict = yaml_load(\"tutorials/mct_model_garden/models_pytorch/yolov8/yolov8n.yaml\", append_filename=True) # model dict\n",
  "model = ModelPyTorch.from_pretrained(\"SSI-DNN/pytorch_yolov8n_640x640_bb_decoding\", cfg=cfg_dict)"
- ]
+ ],
+ "metadata": {
+ "collapsed": false
+ },
+ "id": "b7d057120847fc3c"
  },
  {
  "cell_type": "markdown",
- "id": "3cde2f8e-0642-4374-a1f4-df2775fe7767",
- "metadata": {},
  "source": [
  "### Post training quantization using Model Compression Toolkit \n",
  "\n",
  "Now, we're all set to use MCT's post-training quantization. To begin, we'll define a representative dataset and proceed with the model quantization. Please note that, for demonstration purposes, we'll use the evaluation dataset as our representative dataset. We'll calibrate the model using 80 representative images, divided into 20 iterations of 'batch_size' images each. \n",
  "\n",
  "Additionally, to further compress the model's memory footprint, we will employ the mixed-precision quantization technique. This method allows each layer to be quantized with different precision options: 2, 4, and 8 bits, aligning with the imx500 target platform capabilities."
- ]
+ ],
+ "metadata": {
+ "collapsed": false
+ },
+ "id": "6ecd174ab64a5ff3"
  },
  {
  "cell_type": "code",
  "execution_count": null,
- "id": "56393342-cecf-4f64-b9ca-2f515c765942",
- "metadata": {
- "collapsed": false
- },
  "outputs": [],
  "source": [
+ "from model_compression_toolkit.core.common.network_editors import NodeNameScopeFilter\n",
+ "from model_compression_toolkit.core import BitWidthConfig\n",
  "import model_compression_toolkit as mct\n",
  "from tutorials.mct_model_garden.evaluation_metrics.coco_evaluation import coco_dataset_generator\n",
  "from tutorials.mct_model_garden.models_pytorch.yolov8.yolov8_preprocess import yolov8_preprocess_chw_transpose\n",
@@ -193,51 +197,80 @@
  "representative_dataset_gen = get_representative_dataset(n_iter=n_iters,\n",
  " dataset_loader=representative_dataset)\n",
  "\n",
- "# Set IMX500-v1 TPC\n",
+ "# Set IMX500-v4 TPC (extended support in 16b operations)\n",
  "tpc = mct.get_target_platform_capabilities(fw_name=\"pytorch\",\n",
  " target_platform_name='imx500',\n",
- " target_platform_version='v1')\n",
- "\n",
- "# Specify the necessary configuration for mixed precision quantization. To keep the tutorial brief, we'll use a small set of images and omit the hessian metric for mixed precision calculations. It's important to be aware that this choice may impact the resulting accuracy. \n",
- "mp_config = mct.core.MixedPrecisionQuantizationConfig(num_of_images=5,\n",
- " use_hessian_based_scores=False)\n",
- "config = mct.core.CoreConfig(mixed_precision_config=mp_config,\n",
- " quantization_config=mct.core.QuantizationConfig(shift_negative_activation_correction=True))\n",
- "\n",
- "# Define target Resource Utilization for mixed precision weights quantization (75% of 'standard' 8bits quantization)\n",
- "resource_utilization_data = mct.core.pytorch_resource_utilization_data(in_model=model,\n",
- " representative_data_gen=\n",
- " representative_dataset_gen,\n",
- " core_config=config,\n",
- " target_platform_capabilities=tpc)\n",
- "resource_utilization = mct.core.ResourceUtilization(weights_memory=resource_utilization_data.weights_memory * 0.75)\n",
+ " target_platform_version='v4')\n",
+ "\n",
+ "# Configure MCT manually for specific layers\n",
+ "manual_bit_cfg = BitWidthConfig()\n",
+ "manual_bit_cfg.set_manual_activation_bit_width(\n",
+ " [NodeNameScopeFilter('mul'),\n",
+ " NodeNameScopeFilter('sub'),\n",
+ " NodeNameScopeFilter('sub_1'),\n",
+ " NodeNameScopeFilter('add_6'),\n",
+ " NodeNameScopeFilter('add_7'),\n",
+ " NodeNameScopeFilter('stack')], 16)\n",
+ "\n",
+ "# Specify the necessary configuration for mixed precision quantization \n",
+ "config = mct.core.CoreConfig(mixed_precision_config=mct.core.MixedPrecisionQuantizationConfig(num_of_images=10),\n",
+ " quantization_config=mct.core.QuantizationConfig(concat_threshold_update=True),\n",
+ " bit_width_config=manual_bit_cfg)\n",
+ "\n",
+ "# Define target Resource Utilization for mixed precision weights quantization (76% of 'standard' 8bits quantization).\n",
+ "# We measure the number of parameters to be 3146176 and calculate the target memory (in Bytes).\n",
+ "resource_utilization = mct.core.ResourceUtilization(weights_memory=3146176 * 0.76)\n",
  "\n",
  "# Perform post training quantization\n",
  "quant_model, _ = mct.ptq.pytorch_post_training_quantization(in_module=model,\n",
- " representative_data_gen=\n",
- " representative_dataset_gen,\n",
+ " representative_data_gen=representative_dataset_gen,\n",
  " target_resource_utilization=resource_utilization,\n",
  " core_config=config,\n",
  " target_platform_capabilities=tpc)\n",
- "print('Quantized model is ready')\n",
- "\n",
+ "print('Quantized model is ready')"
+ ],
+ "metadata": {
+ "collapsed": false
+ },
+ "id": "cec4035ce185614d"
+ },
+ {
+ "cell_type": "markdown",
+ "source": [
+ "### Postprocess integration\n",
+ "Integrate the postprocess to the model using NMS custom layer"
+ ],
+ "metadata": {
+ "collapsed": false
+ },
+ "id": "5fb70430b48edb3"
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "outputs": [],
+ "source": [
  "# Wrapped the quantized model with PostProcess NMS.\n",
  "from tutorials.mct_model_garden.models_pytorch.yolov8.yolov8 import PostProcessWrapper\n",
+ "from model_compression_toolkit.core.pytorch.pytorch_device_config import get_working_device\n",
  "\n",
  "# Define PostProcess params\n",
  "score_threshold = 0.001\n",
  "iou_threshold = 0.7\n",
  "max_detections = 300\n",
  "\n",
  "# Get working device\n",
- "from model_compression_toolkit.core.pytorch.pytorch_device_config import get_working_device\n",
  "device = get_working_device()\n",
  "\n",
  "quant_model_pp = PostProcessWrapper(model=quant_model,\n",
  " score_threshold=score_threshold,\n",
  " iou_threshold=iou_threshold,\n",
  " max_detections=max_detections).to(device=device)"
- ]
+ ],
+ "metadata": {
+ "collapsed": false
+ },
+ "id": "86fe1368d9b501a9"
  },
  {
  "cell_type": "markdown",
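
The weights-memory target hard-coded in the notebook (`3146176 * 0.76`) follows from simple arithmetic. The sketch below works it through under one assumption that is not stated in the diff: an 8-bit weight occupies exactly one byte, so the 8-bit baseline in bytes equals the parameter count. The helper name `weights_memory_target` is hypothetical; the parameter count and the 0.76 factor are taken from the notebook comment.

```python
NUM_PARAMS = 3146176      # parameter count quoted in the notebook comment
TARGET_FRACTION = 0.76    # fraction of the 8-bit baseline used in the notebook


def weights_memory_target(num_params: int,
                          fraction: float,
                          bytes_per_weight: float = 1.0) -> float:
    """Return the weights-memory budget in bytes.

    bytes_per_weight defaults to 1.0, i.e. the assumed 8-bit baseline.
    """
    return num_params * bytes_per_weight * fraction


baseline_bytes = weights_memory_target(NUM_PARAMS, 1.0)
target_bytes = weights_memory_target(NUM_PARAMS, TARGET_FRACTION)

# Average bit width the mixed-precision search has to reach to fit the budget.
average_bits = 8 * target_bytes / baseline_bytes

if __name__ == "__main__":
    print(f"baseline: {baseline_bytes:.0f} bytes")
    print(f"target:   {target_bytes:.2f} bytes")
    print(f"average weight bit width: {average_bits:.2f}")
```

Under that assumption the budget is roughly 2.39 MB, which corresponds to an average of about 6.08 bits per weight across the layers that mixed precision may assign 2, 4, or 8 bits.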