Add YOLO and tiny YOLO v2 models w/ sparsities.
Add descriptions for all three models.

Add model link.

Update yolo-tiny-v2-ava-0001.md

Update yolo-tiny-v2-ava-0001.md

Update yolo-tiny-v2-ava-0001.md

Update yolo-tiny-v2-ava-sparse-30-0001.md

Update yolo-tiny-v2-ava-sparse-30-0001.md

Update yolo-tiny-v2-ava-sparse-30-0001.md

Update yolo-tiny-v2-ava-sparse-60-0001.md

Update yolo-tiny-v2-ava-0001.md

Update yolo-tiny-v2-ava-0001.md

Add sparse YOLO v2 models.

Update models/intel/yolo-tiny-v2-ava-0001/description/yolo-tiny-v2-ava-0001.md

Co-Authored-By: Roman Donchenko <[email protected]>

add FP16 configs

fix models readme names
eldercrow committed Mar 3, 2020
1 parent d9ec041 commit b0370bb
Showing 13 changed files with 439 additions and 5 deletions.
6 changes: 6 additions & 0 deletions models/intel/index.md
@@ -40,6 +40,12 @@ network to detect objects of the same type better.
| [vehicle-license-plate-detection-barrier-0106](./vehicle-license-plate-detection-barrier-0106/description/vehicle-license-plate-detection-barrier-0106.md) | 0.349 | 0.634 | | | X | | X | |
| [product-detection-0001](./product-detection-0001/description/product-detection-0001.md) | 3.598 | 3.212 | | | | | | X |
| [person-detection-asl-0001](./person-detection-asl-0001/description/person-detection-asl-0001.md) | 0.986 | 1.338 | | X | | | | |
| [yolo-v2-ava-0001](./yolo-v2-ava-0001/description/yolo-v2-ava-0001.md) | 29.38 | 48.29 | | X | X | X | | |
| [yolo-v2-ava-sparse-35-0001](./yolo-v2-ava-sparse-35-0001/description/yolo-v2-ava-sparse-35-0001.md) | 29.38 | 48.29 | | X | X | X | | |
| [yolo-v2-ava-sparse-70-0001](./yolo-v2-ava-sparse-70-0001/description/yolo-v2-ava-sparse-70-0001.md) | 29.38 | 48.29 | | X | X | X | | |
| [yolo-tiny-v2-ava-0001](./yolo-tiny-v2-ava-0001/description/yolo-tiny-v2-ava-0001.md) | 6.975 | 15.12 | | X | X | X | | |
| [yolo-tiny-v2-ava-sparse-30-0001](./yolo-tiny-v2-ava-sparse-30-0001/description/yolo-tiny-v2-ava-sparse-30-0001.md) | 6.975 | 15.12 | | X | X | X | | |
| [yolo-tiny-v2-ava-sparse-60-0001](./yolo-tiny-v2-ava-sparse-60-0001/description/yolo-tiny-v2-ava-sparse-60-0001.md) | 6.975 | 15.12 | | X | X | X | | |


## Object Recognition Models
@@ -2,7 +2,7 @@

## Use Case and High-Level Description

This is a re-implemented and re-trained version of tiny YOLO v2 object detection network trained with VOC2012 training dataset.
This is a re-implemented and re-trained version of the [tiny YOLO v2](https://arxiv.org/abs/1612.08242) object detection network, trained on the VOC2012 training dataset.

## Example

@@ -11,12 +11,11 @@ This is a re-implemented and re-trained version of tiny YOLO v2 object detection
| Metric | Value |
|---------------------------------|-------------------------------------------|
| Mean Average Precision (mAP) | 35.37% |
| Flops | 6.97Bn |
| Source framework | Tensorflow* |
| Flops | 6.97Bn* |
| Source framework | TensorFlow** |

Average Precision metric described in: Mark Everingham et al.
["The PASCAL Visual Object Classes (VOC) Challenge"](http://host.robots.ox.ac.uk/pascal/VOC/pubs/everingham10.pdf).

Tested on VOC 2012 validation dataset.

## Performance
@@ -43,4 +42,6 @@ Tested on VOC 2012 validation dataset.
- `y_loc` and `x_loc`: spatial coordinates of each grid cell.

## Legal Information
[*] Other names and brands may be claimed as the property of others.
[*] Same as the original implementation.

[**] Other names and brands may be claimed as the property of others.
@@ -0,0 +1,48 @@
# yolo-tiny-v2-ava-sparse-30-0001

## Use Case and High-Level Description

This is a re-implemented and re-trained version of the [tiny YOLO v2](https://arxiv.org/abs/1612.08242) object detection network, trained on the VOC2012 training dataset.
[Network weight pruning](https://arxiv.org/abs/1710.01878) is applied to sparsify convolution layers (30% of the network parameters are set to zero); the sketch below illustrates the idea.
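
For intuition, the cited method zeroes the weights with the smallest magnitudes until the target sparsity is reached (gradually over the course of training in the paper). Below is a minimal NumPy sketch of a single magnitude-pruning step; it is illustrative only, not the training code used for these models.

```python
import numpy as np

def magnitude_prune(weights: np.ndarray, sparsity: float) -> np.ndarray:
    """Zero the `sparsity` fraction of entries with the smallest magnitude."""
    k = int(sparsity * weights.size)
    if k == 0:
        return weights.copy()
    # The k-th smallest absolute value becomes the pruning threshold.
    threshold = np.partition(np.abs(weights).ravel(), k - 1)[k - 1]
    mask = np.abs(weights) > threshold
    return weights * mask

# Example: prune a conv kernel tensor to roughly 30% sparsity.
kernel = np.random.randn(16, 3, 3, 3).astype(np.float32)
pruned = magnitude_prune(kernel, sparsity=0.30)
print("sparsity:", 1.0 - np.count_nonzero(pruned) / pruned.size)
```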

## Example

## Specification

| Metric | Value |
|---------------------------------|-------------------------------------------|
| Mean Average Precision (mAP) | 36.33% |
| Flops | 6.97Bn* |
| Source framework                | TensorFlow**                               |

Average Precision metric described in: Mark Everingham et al.
["The PASCAL Visual Object Classes (VOC) Challenge"](http://host.robots.ox.ac.uk/pascal/VOC/pubs/everingham10.pdf).
Tested on VOC 2012 validation dataset.

## Performance

## Inputs

1. name: "input", shape: [1x3x416x416] - An input image in the format [BxCxHxW],
where:
- B - batch size
- C - number of channels
- H - image height
- W - image width.
Expected color order is BGR; see the preprocessing sketch below.
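
A minimal preprocessing sketch with OpenCV that produces such a blob (the image path is a placeholder; `cv2.imread` already returns BGR, matching the expected color order, and no pixel scaling is applied since the description does not state a scale factor):

```python
import cv2
import numpy as np

image = cv2.imread("example.jpg")          # HxWxC, BGR order
resized = cv2.resize(image, (416, 416))    # match the 416x416 network input
chw = resized.transpose(2, 0, 1)           # HWC -> CHW
blob = chw[np.newaxis].astype(np.float32)  # add batch dim -> [1, 3, 416, 416]
```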

## Outputs

1. The net outputs a blob with the shape: [1, 21125], which can be reshaped to [5, 25, 13, 13],
where each number corresponds to [`num_anchors`, `cls_reg_obj_params`, `y_loc`, `x_loc`], respectively.
- `num_anchors`: number of anchor boxes; each spatial location specified by `y_loc` and `x_loc` has 5 anchors.
- `cls_reg_obj_params`: parameters for classification and regression. The values consist of the following:
* Regression parameters (4)
* Objectness score (1)
* Class score (20)
- `y_loc` and `x_loc`: spatial coordinates of each grid cell (see the decoding sketch below).
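
A minimal sketch of reshaping the raw output and splitting it along the `cls_reg_obj_params` axis, in the order listed above (the zero array stands in for a real network output):

```python
import numpy as np

raw = np.zeros((1, 21125), dtype=np.float32)  # stand-in for the net output

grid = raw.reshape(5, 25, 13, 13)  # [num_anchors, cls_reg_obj_params, y_loc, x_loc]
box_regression = grid[:, 0:4]      # 4 regression parameters per anchor
objectness = grid[:, 4]            # 1 objectness score per anchor
class_scores = grid[:, 5:25]       # 20 class scores per anchor
print(box_regression.shape, objectness.shape, class_scores.shape)
# (5, 4, 13, 13) (5, 13, 13) (5, 20, 13, 13)
```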

## Legal Information
[*] Same as the original implementation.

[**] Other names and brands may be claimed as the property of others.
@@ -0,0 +1,49 @@
# yolo-tiny-v2-ava-sparse-60-0001

## Use Case and High-Level Description

This is a re-implemented and re-trained version of the [tiny YOLO v2](https://arxiv.org/abs/1612.08242) object detection network, trained on the VOC2012 training dataset.
[Network weight pruning](https://arxiv.org/abs/1710.01878) is applied to sparsify convolution layers (60% of the network parameters are set to zero).

## Example

## Specification

| Metric | Value |
|---------------------------------|-------------------------------------------|
| Mean Average Precision (mAP) | 35.32% |
| Flops | 6.97Bn* |
| Source framework                | TensorFlow**                               |

Average Precision metric described in: Mark Everingham et al.
["The PASCAL Visual Object Classes (VOC) Challenge"](http://host.robots.ox.ac.uk/pascal/VOC/pubs/everingham10.pdf).

Tested on VOC 2012 validation dataset.

## Performance

## Inputs

1. name: "input", shape: [1x3x416x416] - An input image in the format [BxCxHxW],
where:
- B - batch size
- C - number of channels
- H - image height
- W - image width.
Expected color order is BGR.

## Outputs

1. The net outputs a blob with the shape: [1, 21125], which can be reshaped to [5, 25, 13, 13],
where each number corresponds to [`num_anchors`, `cls_reg_obj_params`, `y_loc`, `x_loc`], respectively.
- `num_anchors`: number of anchor boxes; each spatial location specified by `y_loc` and `x_loc` has 5 anchors.
- `cls_reg_obj_params`: parameters for classification and regression. The values consist of the following:
* Regression parameters (4)
* Objectness score (1)
* Class score (20)
- `y_loc` and `x_loc`: spatial coordinates of each grid cell.

## Legal Information
[*] Same as the original implementation.

[**] Other names and brands may be claimed as the property of others.
47 changes: 47 additions & 0 deletions models/intel/yolo-v2-ava-0001/description/yolo-v2-ava-0001.md
@@ -0,0 +1,47 @@
# yolo-v2-ava-0001

## Use Case and High-Level Description

This is a re-implemented and re-trained version of the [YOLO v2](https://arxiv.org/abs/1612.08242) object detection network, trained on the VOC2012 training dataset.

## Example

## Specification

| Metric | Value |
|---------------------------------|-------------------------------------------|
| Mean Average Precision (mAP) | 63.9% |
| Flops | 48.29Bn* |
| Source framework                | TensorFlow**                               |

Average Precision metric described in: Mark Everingham et al.
["The PASCAL Visual Object Classes (VOC) Challenge"](http://host.robots.ox.ac.uk/pascal/VOC/pubs/everingham10.pdf).
Tested on VOC 2012 validation dataset.

## Performance

## Inputs

1. name: "input", shape: [1x3x416x416] - An input image in the format [BxCxHxW],
where:
- B - batch size
- C - number of channels
- H - image height
- W - image width.
Expected color order is BGR.

## Outputs

1. The net outputs a blob with the shape: [1, 21125], which can be reshaped to [5, 25, 13, 13],
where each number corresponds to [`num_anchors`, `cls_reg_obj_params`, `y_loc`, `x_loc`], respectively.
- `num_anchors`: number of anchor boxes; each spatial location specified by `y_loc` and `x_loc` has 5 anchors.
- `cls_reg_obj_params`: parameters for classification and regression. The values consist of the following:
* Regression parameters (4)
* Objectness score (1)
* Class score (20)
- `y_loc` and `x_loc`: spatial coordinates of each grid cell (see the box-decoding sketch below).
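
Turning these parameters into boxes follows the standard YOLO v2 decoding, sketched below. The (x, y, w, h) ordering of the regression parameters and the anchor sizes are assumptions (the anchors shown are the values commonly used for YOLO v2 on VOC), not values taken from this model's files:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Assumed anchor sizes in grid-cell units (common YOLO v2 VOC values).
ANCHORS = [(1.3221, 1.73145), (3.19275, 4.00944), (5.05587, 8.09892),
           (9.47112, 4.84053), (11.2364, 10.0071)]

def decode(grid: np.ndarray, conf_threshold: float = 0.5):
    """grid: [5, 25, 13, 13] array; returns (x, y, w, h, score, class_id)
    tuples with coordinates relative to the input image."""
    detections = []
    for a, (pw, ph) in enumerate(ANCHORS):
        for gy in range(13):
            for gx in range(13):
                tx, ty, tw, th = grid[a, 0:4, gy, gx]  # assumed (x, y, w, h)
                score = sigmoid(grid[a, 4, gy, gx])    # objectness
                if score < conf_threshold:
                    continue
                logits = grid[a, 5:25, gy, gx]
                probs = np.exp(logits - logits.max())
                probs /= probs.sum()                   # softmax over 20 classes
                detections.append(((sigmoid(tx) + gx) / 13.0,
                                   (sigmoid(ty) + gy) / 13.0,
                                   pw * np.exp(tw) / 13.0,
                                   ph * np.exp(th) / 13.0,
                                   score * probs.max(),
                                   int(probs.argmax())))
    return detections

boxes = decode(np.random.randn(5, 25, 13, 13).astype(np.float32))
```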

## Legal Information
[*] Same as the original implementation.

[**] Other names and brands may be claimed as the property of others.
@@ -0,0 +1,48 @@
# yolo-v2-ava-sparse-35-0001

## Use Case and High-Level Description

This is a re-implemented and re-trained version of the [YOLO v2](https://arxiv.org/abs/1612.08242) object detection network, trained on the VOC2012 training dataset.
[Network weight pruning](https://arxiv.org/abs/1710.01878) is applied to sparsify convolution layers (35% of the network parameters are set to zero).

## Example

## Specification

| Metric | Value |
|------------------------------|--------------|
| Mean Average Precision (mAP) | 63.71% |
| Flops | 48.29Bn* |
| Source framework             | TensorFlow** |

Average Precision metric described in: Mark Everingham et al.
["The PASCAL Visual Object Classes (VOC) Challenge"](http://host.robots.ox.ac.uk/pascal/VOC/pubs/everingham10.pdf).
Tested on VOC 2012 validation dataset.

## Performance

## Inputs

1. name: "input", shape: [1x3x416x416] - An input image in the format [BxCxHxW],
where:
- B - batch size
- C - number of channels
- H - image height
- W - image width.
Expected color order is BGR.

## Outputs

1. The net outputs a blob with the shape: [1, 21125], which can be reshaped to [5, 25, 13, 13],
where each number corresponds to [`num_anchors`, `cls_reg_obj_params`, `y_loc`, `x_loc`], respectively.
- `num_anchors`: number of anchor boxes; each spatial location specified by `y_loc` and `x_loc` has 5 anchors.
- `cls_reg_obj_params`: parameters for classification and regression. The values consist of the following:
* Regression parameters (4)
* Objectness score (1)
* Class score (20)
- `y_loc` and `x_loc`: spatial coordinates of each grid cell.

## Legal Information
[*] Same as the original implementation.

[**] Other names and brands may be claimed as the property of others.
@@ -0,0 +1,49 @@
# yolo-v2-ava-sparse-70-0001

## Use Case and High-Level Description

This is a re-implemented and re-trained version of the [YOLO v2](https://arxiv.org/abs/1612.08242) object detection network, trained on the VOC2012 training dataset.
[Network weight pruning](https://arxiv.org/abs/1710.01878) is applied to sparsify convolution layers (70% of the network parameters are set to zero).

## Example

## Specification

| Metric | Value |
|------------------------------|--------------|
| Mean Average Precision (mAP) | 62.9% |
| Flops | 48.29Bn* |
| Source framework             | TensorFlow** |

Average Precision metric described in: Mark Everingham et al.
["The PASCAL Visual Object Classes (VOC) Challenge"](http://host.robots.ox.ac.uk/pascal/VOC/pubs/everingham10.pdf).

Tested on VOC 2012 validation dataset.

## Performance

## Inputs

1. name: "input", shape: [1x3x416x416] - An input image in the format [BxCxHxW],
where:
- B - batch size
- C - number of channels
- H - image height
- W - image width.
Expected color order is BGR.

## Outputs

1. The net outputs a blob with the shape: [1, 21125], which can be reshaped to [5, 25, 13, 13],
where each number corresponds to [`num_anchors`, `cls_reg_obj_params`, `y_loc`, `x_loc`], respectively.
- `num_anchors`: number of anchor boxes; each spatial location specified by `y_loc` and `x_loc` has 5 anchors.
- `cls_reg_obj_params`: parameters for classification and regression. The values consist of the following:
* Regression parameters (4)
* Objectness score (1)
* Class score (20)
- `y_loc` and `x_loc`: spatial coordinates of each grid cell.

## Legal Information
[*] Same as the original implementation.

[**] Other names and brands may be claimed as the property of others.
16 changes: 16 additions & 0 deletions tools/accuracy_checker/configs/yolo-tiny-v2-ava-0001.yml
@@ -10,6 +10,14 @@ models:
adapter:
type: yolo_v2
anchors: tiny_yolo_v2
- framework: dlsdk
tags:
- FP16
model: intel/yolo-tiny-v2-ava-0001/FP16/yolo-tiny-v2-ava-0001.xml
weights: intel/yolo-tiny-v2-ava-0001/FP16/yolo-tiny-v2-ava-0001.bin
adapter:
type: yolo_v2
anchors: tiny_yolo_v2
- framework: dlsdk
tags:
- FP32-INT8
@@ -18,6 +26,14 @@ models:
adapter:
type: yolo_v2
anchors: tiny_yolo_v2
- framework: dlsdk
tags:
- FP16-INT8
model: intel/yolo-tiny-v2-ava-0001/FP16-INT8/yolo-tiny-v2-ava-0001.xml
weights: intel/yolo-tiny-v2-ava-0001/FP16-INT8/yolo-tiny-v2-ava-0001.bin
adapter:
type: yolo_v2
anchors: tiny_yolo_v2
datasets:
- name: VOC2012_without_background
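
These entries map precision tags to IR file pairs. As a usage note, the same pair can be loaded directly with the 2020-era OpenVINO Inference Engine Python API; a minimal sketch (the paths are assumed to exist locally after downloading the model):

```python
import numpy as np
from openvino.inference_engine import IECore

# Paths mirror the model/weights fields in the config above.
xml_path = "intel/yolo-tiny-v2-ava-0001/FP16/yolo-tiny-v2-ava-0001.xml"
bin_path = "intel/yolo-tiny-v2-ava-0001/FP16/yolo-tiny-v2-ava-0001.bin"

ie = IECore()
net = ie.read_network(model=xml_path, weights=bin_path)
exec_net = ie.load_network(network=net, device_name="CPU")

# Dummy [1, 3, 416, 416] BGR blob; see the preprocessing sketch in the
# model description for building a real one.
blob = np.zeros((1, 3, 416, 416), dtype=np.float32)
outputs = exec_net.infer({"input": blob})  # input name per the description
for name, value in outputs.items():
    print(name, value.shape)               # expect a [1, 21125] blob
```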

16 changes: 16 additions & 0 deletions tools/accuracy_checker/configs/yolo-tiny-v2-ava-sparse-30-0001.yml
@@ -10,6 +10,14 @@ models:
adapter:
type: yolo_v2
anchors: tiny_yolo_v2
- framework: dlsdk
tags:
- FP16
model: intel/yolo-tiny-v2-ava-sparse-30-0001/FP16/yolo-tiny-v2-ava-sparse-30-0001.xml
weights: intel/yolo-tiny-v2-ava-sparse-30-0001/FP16/yolo-tiny-v2-ava-sparse-30-0001.bin
adapter:
type: yolo_v2
anchors: tiny_yolo_v2
- framework: dlsdk
tags:
- FP32-INT8
@@ -18,6 +26,14 @@ models:
adapter:
type: yolo_v2
anchors: tiny_yolo_v2
- framework: dlsdk
tags:
- FP16-INT8
model: intel/yolo-tiny-v2-ava-sparse-30-0001/FP16-INT8/yolo-tiny-v2-ava-sparse-30-0001.xml
weights: intel/yolo-tiny-v2-ava-sparse-30-0001/FP16-INT8/yolo-tiny-v2-ava-sparse-30-0001.bin
adapter:
type: yolo_v2
anchors: tiny_yolo_v2
datasets:
- name: VOC2012_without_background

16 changes: 16 additions & 0 deletions tools/accuracy_checker/configs/yolo-tiny-v2-ava-sparse-60-0001.yml
@@ -10,6 +10,14 @@ models:
adapter:
type: yolo_v2
anchors: tiny_yolo_v2
- framework: dlsdk
tags:
- FP16
model: intel/yolo-tiny-v2-ava-sparse-60-0001/FP16/yolo-tiny-v2-ava-sparse-60-0001.xml
weights: intel/yolo-tiny-v2-ava-sparse-60-0001/FP16/yolo-tiny-v2-ava-sparse-60-0001.bin
adapter:
type: yolo_v2
anchors: tiny_yolo_v2
- framework: dlsdk
tags:
- FP32-INT8
@@ -18,6 +26,14 @@ models:
adapter:
type: yolo_v2
anchors: tiny_yolo_v2
- framework: dlsdk
tags:
- FP16-INT8
model: intel/yolo-tiny-v2-ava-sparse-60-0001/FP16-INT8/yolo-tiny-v2-ava-sparse-60-0001.xml
weights: intel/yolo-tiny-v2-ava-sparse-60-0001/FP16-INT8/yolo-tiny-v2-ava-sparse-60-0001.bin
adapter:
type: yolo_v2
anchors: tiny_yolo_v2
datasets:
- name: VOC2012_without_background
