Reproduce training and benchmark results via Docker (#3)
* Add initial dockerfile that's installing dependencies

* Add update docker compose file with dependencies

* Use fork of DCNv2 that's compatible with PyTorch 1.x

* Add dockerignore and update gitignore

* Update networks for updated dcn_v2

* Fix build to generate proper DCN binary for gpu

* Update README

* Remove vendored DCNv2

* Reform install instructions

* Add instructions for using the docker container

* Remove vestigial tools from CenterNet repo

Fairly certain that these files are not relevant to experiments for the
grasping network.

* Move readme to docs

* Move files into better directory structure

* Add setup and fix scripts

* Fix more imports

* Remove from __future__ imports

* Run pre-commit on all the files

* Remove extra imports

* Replace progress with tqdm and simplify testing

* Add json logger

* Add script to evaluate all networks and update loggers

* Replace progress with tqdm in train

* Clean up dockerfile

* Remove debug script and remove progress dependency

* Add dataset documentation to repo

* Document structure of models and datasets

* Update INSTALL and DEVELOP docs

* Move demo into docs directory

* Update image name to gknet

* Update pytorch version typo
acmiyaguchi authored Apr 5, 2023
1 parent fc9553d commit 66e2a13
Showing 194 changed files with 9,350 additions and 29,082 deletions.
4 changes: 4 additions & 0 deletions .dockerignore
@@ -0,0 +1,4 @@
/models/
/exp/
/datasets/
/data/
6 changes: 6 additions & 0 deletions .gitignore
@@ -115,3 +115,9 @@ ENV/

# cache file
*~

# experiments
/exp/
/datasets/
/models/
/data/
35 changes: 35 additions & 0 deletions .pre-commit-config.yaml
@@ -0,0 +1,35 @@
repos:
  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v4.4.0
    hooks:
      - id: check-yaml
      - id: end-of-file-fixer
      - id: trailing-whitespace
  - repo: https://github.com/pycqa/isort
    rev: 5.12.0
    hooks:
      - id: isort
        name: isort (python)
        args: [--profile=black]
  - repo: https://github.com/psf/black
    rev: 23.1.0
    hooks:
      - id: black
      - id: black-jupyter
  - repo: https://github.com/pre-commit/mirrors-prettier
    rev: v3.0.0-alpha.6
    hooks:
      - id: prettier
        additional_dependencies:
          - [email protected]
          - "@prettier/[email protected]"
      - id: prettier
        files: .(launch|test|world)$
        additional_dependencies:
          - [email protected]
          - "@prettier/[email protected]"
  - repo: https://github.com/codespell-project/codespell
    rev: v2.2.4
    hooks:
      - id: codespell
        args: [--ignore-words-list=thw]
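
For local development, a minimal sketch of how this configuration might be exercised (assuming the `pre-commit` package is installed with pip; adjust to your environment):

```bash
# Install the pre-commit CLI (assumed to be available via pip)
pip install pre-commit

# Register the hooks from .pre-commit-config.yaml so they run on each `git commit`
pre-commit install

# Optionally run every hook against the whole tree once,
# mirroring the "Run pre-commit on all the files" step in this commit
pre-commit run --all-files
```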
204 changes: 135 additions & 69 deletions README.md
@@ -1,18 +1,17 @@
# GKNet
Grasp Keypoint Network for Grasp Candidates Detection
# GKNet: Grasp Keypoint Network for Grasp Candidates Detection

![](https://github.com/ivalab/GraspKpNet/blob/main/demo/fig_ill_mul_resized.png)

>[**GKNet: grasp keypoint network for Grasp Candidates Detection**](),
![fig_ill_mul_resized](./docs/demo/fig_ill_mul_resized.png)

> **GKNet: grasp keypoint network for Grasp Candidates Detection** <br>
> Ruinian Xu, Fu-Jen Chu and Patricio A. Vela
## Table of Contents

- [Abstract](#Abstract)
- [Highlights](#Highlights)
- [Vision Benchmark Results](#Vision-Benchmark-Results)
* [Cornell](#Grasp-detection-on-the-Cornell-Dataset)
* [AJD](#Grasp-detection-on-the-AJD)
- [Cornell](#Grasp-detection-on-the-Cornell-Dataset)
- [AJD](#Grasp-detection-on-the-AJD)
- [Installation](#Installation)
- [Dataset](#Dataset)
- [Usage](#Usage)
@@ -21,118 +20,185 @@ Grasp Keypoint Network for Grasp Candidates Detection
- [Supplemental Material](#Supplemental-Material)

## Abstract
Contemporary grasp detection approaches employ deep learning to achieve robustness to sensor and object model
uncertainty. The two dominant approaches design either grasp-quality scoring or anchor-based grasp recognition
networks. This paper presents a different approach to grasp detection by treating it as keypoint detection. The deep
network detects each grasp candidate as a pair of keypoints, convertible to the grasp representation g = {x, y, w, θ}ᵀ,
rather than a triplet or quartet of corner points. Decreasing the detection difficulty by grouping keypoints into pairs boosts
performance. The addition of a non-local module into the grasp keypoint detection architecture promotes dependencies
between a keypoint and its corresponding grasp candidate keypoint. A final filtering strategy based on discrete and
continuous orientation prediction removes false correspondences and further improves grasp detection performance.
GKNet, the approach presented here, achieves the best balance of accuracy and speed on the Cornell and the abridged
Jacquard dataset (96.9% and 98.39% at 41.67 and 23.26 fps). Follow-up experiments on a manipulator evaluate GKNet
using 4 types of grasping experiments reflecting different nuisance sources: static grasping, dynamic grasping, grasping
at varied camera angles, and bin picking. GKNet outperforms reference baselines in static and dynamic grasping
experiments while showing robustness to grasp detection for varied camera viewpoints and bin picking experiments.
The results confirm the hypothesis that grasp keypoints are an effective output representation for deep grasp networks
that provide robustness to expected nuisance factors.

Contemporary grasp detection approaches employ deep learning to achieve robustness to sensor and object model uncertainty.
The two dominant approaches design either grasp-quality scoring or anchor-based grasp recognition networks.
This paper presents a different approach to grasp detection by treating it as keypoint detection.
The deep network detects each grasp candidate as a pair of keypoints, convertible to the grasp representation g = {x, y, w, θ}ᵀ, rather than a triplet or quartet of corner points.
Decreasing the detection difficulty by grouping keypoints into pairs boosts performance.
The addition of a non-local module into the grasp keypoint detection architecture promotes dependencies between a keypoint and its corresponding grasp candidate keypoint.
A final filtering strategy based on discrete and continuous orientation prediction removes false correspondences and further improves grasp detection performance.
GKNet, the approach presented here, achieves the best balance of accuracy and speed on the Cornell and the abridged Jacquard dataset (96.9% and 98.39% at 41.67 and 23.26 fps).
Follow-up experiments on a manipulator evaluate GKNet using 4 types of grasping experiments reflecting different nuisance sources: static grasping, dynamic grasping, grasping at varied camera angles, and bin picking.
GKNet outperforms reference baselines in static and dynamic grasping experiments while showing robustness to grasp detection for varied camera viewpoints and bin picking experiments.
The results confirm the hypothesis that grasp keypoints are an effective output representation for deep grasp networks that provide robustness to expected nuisance factors.

## Highlights
- **Accurate:** The proposed method achieves *96.9%* and *98.39%* detection rate over the Cornell Dataset and AJD, respectively.
- **Fast:** The proposed method is capable of running at real-time speed of *41.67* FPS and *23.26* FPS over the Cornell Dataset and AJD, respectively.

- **Accurate:** The proposed method achieves _96.9%_ and _98.39%_ detection rate over the Cornell Dataset and AJD, respectively.
- **Fast:** The proposed method is capable of running at real-time speed of _41.67_ FPS and _23.26_ FPS over the Cornell Dataset and AJD, respectively.

## Vision Benchmark Results

### Grasp detection on the Cornell Dataset
| Backbone | Acc (w o.f.) / % | Acc (w/o o.f.) / % | Speed / FPS |
|:-------------|:---------------:|:---------------:|:------------:|
|DLA | 96.9 | 96.8 | 41.67 |
|Hourglass-52 | 94.5 | 93.6 | 33.33 |
|Hourglass-104 | 95.5 | 95.3 | 21.27 |
|Resnet-18 | 96.0 | 95.7 | 66.67 |
|Resnet-50 | 96.5 | 96.4 | 52.63 |
|VGG-16 | 96.8 | 96.4 | 55.56 |
|AlexNet | 95.0 | 94.8 | 83.33 |

| Backbone | Acc (w/ o.f.) / % | Acc (w/o o.f.) / % | Speed / FPS |
| :------------ | :--------------: | :----------------: | :---------: |
| DLA | 96.9 | 96.8 | 41.67 |
| Hourglass-52 | 94.5 | 93.6 | 33.33 |
| Hourglass-104 | 95.5 | 95.3 | 21.27 |
| Resnet-18 | 96.0 | 95.7 | 66.67 |
| Resnet-50 | 96.5 | 96.4 | 52.63 |
| VGG-16 | 96.8 | 96.4 | 55.56 |
| AlexNet | 95.0 | 94.8 | 83.33 |

### Grasp detection on the AJD
| Backbone | Acc (w o.f.) / % | Acc (w/o o.f.) / % | Speed / FPS |
|:-------------|:---------------:|:---------------:|:------------:|
|DLA | 98.39 | 96.99 | 23.26 |
|Hourglass-52 | 97.21 | 93.81 | 15.87 |
|Hourglass-104 | 97.93 | 96.04 | 9.90 |
|Resnet-18 | 97.95 | 95.97 | 31.25 |
|Resnet-50 | 98.24 | 95.91 | 25.00 |
|VGG-16 | 98.36 | 96.13 | 21.28 |
|AlexNet | 97.37 | 94.53 | 34.48 |

| Backbone | Acc (w/ o.f.) / % | Acc (w/o o.f.) / % | Speed / FPS |
| :------------ | :--------------: | :----------------: | :---------: |
| DLA | 98.39 | 96.99 | 23.26 |
| Hourglass-52 | 97.21 | 93.81 | 15.87 |
| Hourglass-104 | 97.93 | 96.04 | 9.90 |
| Resnet-18 | 97.95 | 95.97 | 31.25 |
| Resnet-50 | 98.24 | 95.91 | 25.00 |
| VGG-16 | 98.36 | 96.13 | 21.28 |
| AlexNet | 97.37 | 94.53 | 34.48 |

## Installation

Please refer to [INSTALL.md](readme/INSTALL.md) for installation instructions.

## Dataset

The two training datasets are provided here:
- Cornell: [Download link](https://www.dropbox.com/sh/x4t8p2wrqnfevo3/AAC2gLawRtm-986_JWxE0w0Za?dl=0). In case the download link expires in the future, you can also use the matlab scripts provided in the `GKNet_ROOT/scripts/data_aug` to generate your own dataset based on the original Cornell dataset. You will need to modify the corresponding path for loading the input images and output files.
- Abridged Jacquard Dataset (AJD):[Download link](https://smartech.gatech.edu/handle/1853/64897).

- Cornell: [Download link](https://www.dropbox.com/sh/x4t8p2wrqnfevo3/AAC2gLawRtm-986_JWxE0w0Za?dl=0).
If the download link expires in the future, you can also use the MATLAB scripts provided in `GKNet_ROOT/scripts/data_aug` to generate your own dataset based on the original Cornell dataset.
You will need to modify the corresponding path for loading the input images and output files.
- Abridged Jacquard Dataset (AJD): [Download link](https://smartech.gatech.edu/handle/1853/64897).

## Usage
After downloading datasets, place each dataset in the corresponding folder under `GKNet_ROOT/Dataset/`.

After downloading datasets, place each dataset in the corresponding folder under `GKNet_ROOT/datasets/`.
The cornell dataset should be placed under `GKNet_ROOT/datasets/Cornell/` and the AJD should be placed under `GKNet_ROOT/datasets/Jacquard/`.
Download the pretrained [ctdet_coco_dla_2x](https://www.dropbox.com/sh/eicrmhhay2wi8fy/AAAGrToUcdp0tO-F732Xhsxwa?dl=0) model and put it under `GKNet_ROOT/models/`.
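
For reference, a rough sketch of the layout the scripts expect after the downloads above (directory and file names are taken from the surrounding instructions; adjust paths to your checkout):

```bash
# Create the directories referenced by the training/evaluation commands below
mkdir -p datasets/Cornell datasets/Jacquard models

# Expected result, roughly:
#   GKNet_ROOT/
#   ├── datasets/
#   │   ├── Cornell/    <- extracted Cornell download
#   │   └── Jacquard/   <- extracted AJD download
#   └── models/
#       └── ctdet_coco_dla_2x.pth
```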

### Training

For training the Cornell Dataset:
~~~
python main.py dbmctdet_cornell --exp_id dla34 --batch_size 4 --lr 1.25e-4 --arch dla_34 --dataset cornell --load_model ../models/ctdet_coco_dla_2x.pth --num_epochs 15 --val_intervals 1 --save_all --lr_step 5,10
~~~

```bash
python scripts/train.py dbmctdet_cornell \
--exp_id dla34 \
--batch_size 4 \
--lr 1.25e-4 \
--arch dla_34 \
--dataset cornell \
--load_model models/ctdet_coco_dla_2x.pth \
--num_epochs 15 \
--val_intervals 1 \
--save_all \
--lr_step 5,10
```

For training AJD:
~~~
python main.py dbmctdet --exp_id dla34 --batch_size 4 --lr 1.25e-4 --arch dla_34 --dataset jac_coco_36 --load_model ../models/ctdet_coco_dla_2x.pth --num_epochs 30 --val_intervals 1 --save_all
~~~

```bash
python scripts/train.py dbmctdet \
--exp_id dla34 \
--batch_size 4 \
--lr 1.25e-4 \
--arch dla_34 \
--dataset jac_coco_36 \
--load_model models/ctdet_coco_dla_2x.pth \
--num_epochs 30 \
--val_intervals 1 \
--save_all
```

### Evaluation

You can evaluate your own trained models or download [pretrained models](https://www.dropbox.com/sh/eicrmhhay2wi8fy/AAAGrToUcdp0tO-F732Xhsxwa?dl=0) and put them under `GKNet_ROOT/models/`.

For evaluating the Cornell Dataset:
~~~
python test.py dbmctdet_cornell --exp_id dla34_test --arch dla_34 --dataset cornell --fix_res --flag_test --load_model ../models/model_dla34_cornell.pth --ae_threshold 1.0 --ori_threshold 0.24 --center_threshold 0.05
~~~

```bash
python scripts/test.py dbmctdet_cornell \
--exp_id dla34_test \
--arch dla_34 \
--dataset cornell \
--fix_res \
--flag_test \
--load_model models/model_dla34_cornell.pth \
--ae_threshold 1.0 \
--ori_threshold 0.24 \
--center_threshold 0.05
```

For evaluating AJD:
~~~
python test.py dbmctdet --exp_id dla34_test --arch dla_34 --dataset jac_coco_36 --fix_res --flag_test --load_model ../models/model_dla34_ajd.pth --ae_threshold 0.65 --ori_threshold 0.1745 --center_threshold 0.15
~~~

```bash
python scripts/test.py dbmctdet \
--exp_id dla34_test \
--arch dla_34 \
--dataset jac_coco_36 \
--fix_res \
--flag_test \
--load_model models/model_dla34_ajd.pth \
--ae_threshold 0.65 \
--ori_threshold 0.1745 \
--center_threshold 0.15
```

## Develop
If you are interested in training GKNet on a new or customized dataset, please refer to [DEVELOP.md](https://github.com/ivalab/GraspKpNet/blob/master/readme/DEVELOP.md). Also you can leave your issues here if you meet some problems.

If you are interested in training GKNet on a new or customized dataset, please refer to [DEVELOP.md](./docs/DEVELOP.md).
You can also open an issue here if you run into problems.

## Physical Experiments
To run physical experiments with GKNet and ROS, please follow the instructions provided in [Experiment.md](https://github.com/ivalab/GraspKpNet/blob/master/readme/experiment.md).

To run physical experiments with GKNet and ROS, please follow the instructions provided in [Experiment.md](./docs/experiment.md).

## Supplemental Material

This section collects results of experiments and discussions that aren't documented in the manuscript because they don't carry enough scientific value on their own.

### Keypoint representation
This [readme](https://github.com/ivalab/GraspKpNet/blob/main/readme/kp_rep.md) file documents some examples with visualiztions for Top-left, bottom-left and bottom-right (TlBlBr) grasp keypoint representation. These
examples help clarify the effectiveness of grasp keypoint representation of less number of keypoints.

This [readme](./docs/kp_rep.md) file documents some examples with visualizations of the top-left, bottom-left and bottom-right (TlBlBr) grasp keypoint representation.
These examples help clarify the effectiveness of a grasp keypoint representation with fewer keypoints.

### Tuning hyper-parameters of alpha, beta and gamma.
The result is recorded in [tune_hp.md](https://github.com/ivalab/GraspKpNet/blob/main/readme/tune_kp.md)

The results are recorded in [tune_hp.md](./docs/tune_kp.md).

### Demo video

The demo video of all physical experiments is available on [YouTube](https://www.youtube.com/watch?v=Q8-Kr8Q9vC0). Please watch it if you are interested.

### Detailed Result and Tables
Some of the source data was summarized with the raw source data not provided. The links below provide access to the source material:
- [Trial results of bin picking](https://github.com/ivalab/GraspKpNet/blob/main/readme/bin_picking.md) experiment.
- [6-DoF summary results](https://github.com/ivalab/GraspKpNet/blob/main/readme/bin_picking_6DoF.md) for clutter clearance or bin-picking tasks.
### Detailed Result and Tables

Some of the source data was only reported in summarized form, without the raw data being provided.
The links below provide access to the source material:

- [Trial results of bin picking](./docs/bin_picking.md) experiment.
- [6-DoF summary results](./docs/bin_picking_6DoF.md) for clutter clearance or bin-picking tasks.

### Implementation of GGCNN
Considering that GGCNN didn't provide the result of training and testing on the Cornell Dataset, we implemented their work based on their public
repository. The modified version is provided [here](https://github.com/ivalab/ggcnn).

Since GGCNN did not provide results for training and testing on the Cornell Dataset, we implemented their work based on their public repository.
The modified version is provided [here](https://github.com/ivalab/ggcnn).

## License

GKNet is released under the MIT License (refer to the LICENSE file for details).
Portions of the code are borrowed from [CenterNet](https://github.com/xingyizhou/CenterNet), [dla](https://github.com/ucbdrive/dla) (DLA network), [DCNv2](https://github.com/CharlesShang/DCNv2)(deformable convolutions). Please refer to the original License of these projects (See [Notice](https://github.com/ivalab/GKNet/blob/master/NOTICE)).
Portions of the code are borrowed from [CenterNet](https://github.com/xingyizhou/CenterNet), [dla](https://github.com/ucbdrive/dla) (DLA network), and [DCNv2](https://github.com/CharlesShang/DCNv2) (deformable convolutions).
Please refer to the original licenses of these projects (see [NOTICE](https://github.com/ivalab/GKNet/blob/master/NOTICE)).

## Citation

If you use GKNet in your work, please cite:

```
@article{xu2021gknet,
title={GKNet: grasp keypoint network for grasp candidates detection},
25 changes: 25 additions & 0 deletions docker-compose.yml
@@ -0,0 +1,25 @@
# Launch a ros master with the controller manager and associated services
version: "3.8"

services:
  # base container -- will simply exit once brought up
  # we can run commands via this container for running experiments, etc.
  base: &base
    build:
      context: .
      dockerfile: docker/Dockerfile.noetic
    image: ivalab/gknet:latest
    network_mode: host
    shm_size: 2gb
    volumes:
      - ./:/app/
      - .cache/torch:/root/.cache/torch
  gpu:
    <<: *base
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]
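
As a rough sketch of how these services could be used to reproduce the benchmark results (the `base` and `gpu` service names come from the file above; the training invocation simply mirrors the README command and is illustrative, not the only supported entry point):

```bash
# Build the image defined in docker/Dockerfile.noetic
docker compose build base

# Launch a one-off training run on the GPU-enabled service
docker compose run --rm gpu \
    python scripts/train.py dbmctdet_cornell \
        --exp_id dla34 --batch_size 4 --lr 1.25e-4 --arch dla_34 \
        --dataset cornell --load_model models/ctdet_coco_dla_2x.pth \
        --num_epochs 15 --val_intervals 1 --save_all --lr_step 5,10
```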