Add YOLO and tiny YOLO v2 models w/ sparsities.
Add descriptions for all three models.
Add model link.
Update yolo-tiny-v2-ava-0001.md
Update yolo-tiny-v2-ava-0001.md
Update yolo-tiny-v2-ava-0001.md
Update yolo-tiny-v2-ava-sparse-30-0001.md
Update yolo-tiny-v2-ava-sparse-30-0001.md
Update yolo-tiny-v2-ava-sparse-30-0001.md
Update yolo-tiny-v2-ava-sparse-60-0001.md
Update yolo-tiny-v2-ava-0001.md
Update yolo-tiny-v2-ava-0001.md
Add sparse YOLO v2 models.
Update models/intel/yolo-tiny-v2-ava-0001/description/yolo-tiny-v2-ava-0001.md
Co-Authored-By: Roman Donchenko <[email protected]>
add FP16 configs
fix models readme names
Showing 13 changed files with 439 additions and 5 deletions.
.../yolo-tiny-v2-ava-sparse-30-0001/description/yolo-tiny-v2-ava-sparse-30-0001.md (48 additions, 0 deletions)
# yolo-tiny-v2-ava-sparse-30-0001

## Use Case and High-Level Description

This is a re-implemented and re-trained version of the [tiny YOLO v2](https://arxiv.org/abs/1612.08242) object detection network, trained on the VOC2012 training dataset.
[Network weight pruning](https://arxiv.org/abs/1710.01878) is applied to sparsify the convolution layers (30% of the network parameters are set to zero).
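As a rough, hypothetical illustration of what magnitude-based pruning does to a weight tensor, the sketch below zeroes the smallest-magnitude 30% of a randomly initialized convolution kernel in NumPy; the referenced paper prunes gradually during training rather than in one shot, and the tensor shape used here is an arbitrary assumption.

```python
import numpy as np

def magnitude_prune(weights: np.ndarray, sparsity: float) -> np.ndarray:
    """Zero out the smallest-magnitude values so that roughly `sparsity`
    of the tensor becomes exactly zero (one-shot sketch, not the paper's
    gradual schedule)."""
    k = int(sparsity * weights.size)
    if k == 0:
        return weights.copy()
    # The k-th smallest absolute value becomes the pruning threshold.
    threshold = np.partition(np.abs(weights).ravel(), k - 1)[k - 1]
    return np.where(np.abs(weights) <= threshold, 0.0, weights)

# Hypothetical 3x3 convolution kernel pruned to 30% sparsity.
kernel = np.random.randn(16, 8, 3, 3).astype(np.float32)
pruned = magnitude_prune(kernel, sparsity=0.30)
print(1.0 - np.count_nonzero(pruned) / pruned.size)  # ~0.30
```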
## Example

## Specification

| Metric                       | Value        |
|------------------------------|--------------|
| Mean Average Precision (mAP) | 36.33%       |
| Flops                        | 6.97Bn*      |
| Source framework             | TensorFlow** |

The Average Precision metric is described in: Mark Everingham et al.,
["The PASCAL Visual Object Classes (VOC) Challenge"](http://host.robots.ox.ac.uk/pascal/VOC/pubs/everingham10.pdf).
Tested on the VOC 2012 validation dataset.
## Performance

## Inputs

1. Name: "input", shape: [1x3x416x416] - an input image in the format [BxCxHxW], where:
    - B - batch size
    - C - number of channels
    - H - image height
    - W - image width

   The expected color order is BGR. A minimal preprocessing sketch follows this list.
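The following is a minimal sketch of how such a blob can be prepared, assuming OpenCV (which already loads images in BGR order); the file name is hypothetical, and any mean/scale normalization the model may expect is not covered by this description.

```python
import cv2
import numpy as np

def preprocess(image_path: str) -> np.ndarray:
    """Turn an image file into the [1x3x416x416] BGR blob described above."""
    image = cv2.imread(image_path)          # OpenCV reads images as BGR (HxWxC)
    image = cv2.resize(image, (416, 416))   # resize to the network resolution
    blob = image.transpose(2, 0, 1)         # HWC -> CHW
    return np.expand_dims(blob, axis=0)     # add the batch dimension

input_blob = preprocess("example.jpg")      # hypothetical input file
print(input_blob.shape)                     # (1, 3, 416, 416)
```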
## Outputs

1. The net outputs a blob with the shape [1, 21125], which can be reshaped to [5, 25, 13, 13],
   where the dimensions correspond to [`num_anchors`, `cls_reg_obj_params`, `y_loc`, `x_loc`], respectively.
    - `num_anchors`: number of anchor boxes; each spatial location specified by `y_loc` and `x_loc` has 5 anchors.
    - `cls_reg_obj_params`: parameters for classification and regression. The values consist of the following:
        * Regression parameters (4)
        * Objectness score (1)
        * Class scores (20)
    - `y_loc` and `x_loc`: spatial location of each grid cell.

   A NumPy sketch of this layout follows this list.
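To make the layout concrete, here is a minimal NumPy sketch that splits the reshaped blob into its three parts; the ordering within the four regression parameters and the subsequent box decoding (anchor sizes, activations, non-maximum suppression) are outside the scope of this description and are not assumed here.

```python
import numpy as np

def split_yolo_output(blob: np.ndarray):
    """Reshape the [1, 21125] output and split the 25 per-anchor values into
    box regression, objectness and class scores (layout sketch only)."""
    grid = blob.reshape(5, 25, 13, 13)      # [num_anchors, cls_reg_obj_params, y_loc, x_loc]
    box_regression = grid[:, 0:4]           # 4 regression parameters
    objectness     = grid[:, 4:5]           # 1 objectness score
    class_scores   = grid[:, 5:25]          # 20 VOC class scores
    return box_regression, objectness, class_scores

raw = np.random.rand(1, 21125).astype(np.float32)  # stand-in for a real inference result
boxes, obj, classes = split_yolo_output(raw)
print(boxes.shape, obj.shape, classes.shape)        # (5, 4, 13, 13) (5, 1, 13, 13) (5, 20, 13, 13)
```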
## Legal Information

[*] Same as the original implementation.

[**] Other names and brands may be claimed as the property of others.
.../yolo-tiny-v2-ava-sparse-60-0001/description/yolo-tiny-v2-ava-sparse-60-0001.md (49 additions, 0 deletions)
# yolo-tiny-v2-ava-sparse-60-0001

## Use Case and High-Level Description

This is a re-implemented and re-trained version of the [tiny YOLO v2](https://arxiv.org/abs/1612.08242) object detection network, trained on the VOC2012 training dataset.
[Network weight pruning](https://arxiv.org/abs/1710.01878) is applied to sparsify the convolution layers (60% of the network parameters are set to zero).

## Example

## Specification

| Metric                       | Value        |
|------------------------------|--------------|
| Mean Average Precision (mAP) | 35.32%       |
| Flops                        | 6.97Bn*      |
| Source framework             | TensorFlow** |

The Average Precision metric is described in: Mark Everingham et al.,
["The PASCAL Visual Object Classes (VOC) Challenge"](http://host.robots.ox.ac.uk/pascal/VOC/pubs/everingham10.pdf).
Tested on the VOC 2012 validation dataset.

## Performance

## Inputs

1. Name: "input", shape: [1x3x416x416] - an input image in the format [BxCxHxW], where:
    - B - batch size
    - C - number of channels
    - H - image height
    - W - image width

   The expected color order is BGR.

## Outputs

1. The net outputs a blob with the shape [1, 21125], which can be reshaped to [5, 25, 13, 13],
   where the dimensions correspond to [`num_anchors`, `cls_reg_obj_params`, `y_loc`, `x_loc`], respectively.
    - `num_anchors`: number of anchor boxes; each spatial location specified by `y_loc` and `x_loc` has 5 anchors.
    - `cls_reg_obj_params`: parameters for classification and regression. The values consist of the following:
        * Regression parameters (4)
        * Objectness score (1)
        * Class scores (20)
    - `y_loc` and `x_loc`: spatial location of each grid cell.

## Legal Information

[*] Same as the original implementation.

[**] Other names and brands may be claimed as the property of others.
models/intel/yolo-v2-ava-0001/description/yolo-v2-ava-0001.md (47 additions, 0 deletions)
# yolo-v2-ava-0001

## Use Case and High-Level Description

This is a re-implemented and re-trained version of the [YOLO v2](https://arxiv.org/abs/1612.08242) object detection network, trained on the VOC2012 training dataset.

## Example

## Specification

| Metric                       | Value        |
|------------------------------|--------------|
| Mean Average Precision (mAP) | 63.9%        |
| Flops                        | 48.29Bn*     |
| Source framework             | TensorFlow** |

The Average Precision metric is described in: Mark Everingham et al.,
["The PASCAL Visual Object Classes (VOC) Challenge"](http://host.robots.ox.ac.uk/pascal/VOC/pubs/everingham10.pdf).
Tested on the VOC 2012 validation dataset.

## Performance

## Inputs

1. Name: "input", shape: [1x3x416x416] - an input image in the format [BxCxHxW], where:
    - B - batch size
    - C - number of channels
    - H - image height
    - W - image width

   The expected color order is BGR.

## Outputs

1. The net outputs a blob with the shape [1, 21125], which can be reshaped to [5, 25, 13, 13],
   where the dimensions correspond to [`num_anchors`, `cls_reg_obj_params`, `y_loc`, `x_loc`], respectively.
    - `num_anchors`: number of anchor boxes; each spatial location specified by `y_loc` and `x_loc` has 5 anchors.
    - `cls_reg_obj_params`: parameters for classification and regression. The values consist of the following:
        * Regression parameters (4)
        * Objectness score (1)
        * Class scores (20)
    - `y_loc` and `x_loc`: spatial location of each grid cell.

## Legal Information

[*] Same as the original implementation.

[**] Other names and brands may be claimed as the property of others.
models/intel/yolo-v2-ava-sparse-35-0001/description/yolo-v2-ava-sparse-35-0001.md (48 additions, 0 deletions)
# yolo-v2-ava-sparse-35-0001

## Use Case and High-Level Description

This is a re-implemented and re-trained version of the [YOLO v2](https://arxiv.org/abs/1612.08242) object detection network, trained on the VOC2012 training dataset.
[Network weight pruning](https://arxiv.org/abs/1710.01878) is applied to sparsify the convolution layers (35% of the network parameters are set to zero).

## Example

## Specification

| Metric                       | Value        |
|------------------------------|--------------|
| Mean Average Precision (mAP) | 63.71%       |
| Flops                        | 48.29Bn*     |
| Source framework             | TensorFlow** |

The Average Precision metric is described in: Mark Everingham et al.,
["The PASCAL Visual Object Classes (VOC) Challenge"](http://host.robots.ox.ac.uk/pascal/VOC/pubs/everingham10.pdf).
Tested on the VOC 2012 validation dataset.

## Performance

## Inputs

1. Name: "input", shape: [1x3x416x416] - an input image in the format [BxCxHxW], where:
    - B - batch size
    - C - number of channels
    - H - image height
    - W - image width

   The expected color order is BGR.

## Outputs

1. The net outputs a blob with the shape [1, 21125], which can be reshaped to [5, 25, 13, 13],
   where the dimensions correspond to [`num_anchors`, `cls_reg_obj_params`, `y_loc`, `x_loc`], respectively.
    - `num_anchors`: number of anchor boxes; each spatial location specified by `y_loc` and `x_loc` has 5 anchors.
    - `cls_reg_obj_params`: parameters for classification and regression. The values consist of the following:
        * Regression parameters (4)
        * Objectness score (1)
        * Class scores (20)
    - `y_loc` and `x_loc`: spatial location of each grid cell.

## Legal Information

[*] Same as the original implementation.

[**] Other names and brands may be claimed as the property of others.
models/intel/yolo-v2-ava-sparse-70-0001/description/yolo-v2-ava-sparse-70-0001.md (49 additions, 0 deletions)
# yolo-v2-ava-sparse-70-0001

## Use Case and High-Level Description

This is a re-implemented and re-trained version of the [YOLO v2](https://arxiv.org/abs/1612.08242) object detection network, trained on the VOC2012 training dataset.
[Network weight pruning](https://arxiv.org/abs/1710.01878) is applied to sparsify the convolution layers (70% of the network parameters are set to zero).

## Example

## Specification

| Metric                       | Value        |
|------------------------------|--------------|
| Mean Average Precision (mAP) | 62.9%        |
| Flops                        | 48.29Bn*     |
| Source framework             | TensorFlow** |

The Average Precision metric is described in: Mark Everingham et al.,
["The PASCAL Visual Object Classes (VOC) Challenge"](http://host.robots.ox.ac.uk/pascal/VOC/pubs/everingham10.pdf).
Tested on the VOC 2012 validation dataset.

## Performance

## Inputs

1. Name: "input", shape: [1x3x416x416] - an input image in the format [BxCxHxW], where:
    - B - batch size
    - C - number of channels
    - H - image height
    - W - image width

   The expected color order is BGR.

## Outputs

1. The net outputs a blob with the shape [1, 21125], which can be reshaped to [5, 25, 13, 13],
   where the dimensions correspond to [`num_anchors`, `cls_reg_obj_params`, `y_loc`, `x_loc`], respectively.
    - `num_anchors`: number of anchor boxes; each spatial location specified by `y_loc` and `x_loc` has 5 anchors.
    - `cls_reg_obj_params`: parameters for classification and regression. The values consist of the following:
        * Regression parameters (4)
        * Objectness score (1)
        * Class scores (20)
    - `y_loc` and `x_loc`: spatial location of each grid cell.

## Legal Information

[*] Same as the original implementation.

[**] Other names and brands may be claimed as the property of others.