Reproduce training and benchmark results via Docker (#3)
* Add initial dockerfile that's installing dependencies

* Add update docker compose file with dependencies

* Use fork of DCNv2 that's compatible with PyTorch 1.x

* Add dockerignore and update gitignore

* Update networks for updated dcn_v2

* Fix build to generate proper DCN binary for gpu

* Update README

* Remove vendored DCNv2

* Reform install instructions

* Add instructions for using the docker container

* Remove vestigial tools from CenterNet repo

Fairly certain that these files are not relevant to experiments for the
grasping network.

* Move readme to docs

* Move files into better directory structure

* Add setup and fix scripts

* Fix more imports

* Remove from __future__ imports

* Run pre-commit on all the files

* Remove extra imports

* Replace progress with tqdm and simplify testing

* Add json logger

* Add script to evaluate all networks and update loggers

* Replace progress with tqdm in train

* Clean up dockerfile

* Remove debug script and remove progress dependency

* Add dataset documentation to repo

* Document structure of models and datasets

* Update INSTALL and DEVELOP docs

* Move demo into docs directory

* Update image name to gknet

* Update pytorch version typo
acmiyaguchi authored Apr 5, 2023
1 parent fc9553d commit 66e2a13
Showing 194 changed files with 9,350 additions and 29,082 deletions.
4 changes: 4 additions & 0 deletions .dockerignore
@@ -0,0 +1,4 @@
/models/
/exp/
/datasets/
/data/
6 changes: 6 additions & 0 deletions .gitignore
@@ -115,3 +115,9 @@ ENV/

# cache file
*~

# experiments
/exp/
/datasets/
/models/
/data/
35 changes: 35 additions & 0 deletions .pre-commit-config.yaml
@@ -0,0 +1,35 @@
repos:
  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v4.4.0
    hooks:
      - id: check-yaml
      - id: end-of-file-fixer
      - id: trailing-whitespace
  - repo: https://github.com/pycqa/isort
    rev: 5.12.0
    hooks:
      - id: isort
        name: isort (python)
        args: [--profile=black]
  - repo: https://github.com/psf/black
    rev: 23.1.0
    hooks:
      - id: black
      - id: black-jupyter
  - repo: https://github.com/pre-commit/mirrors-prettier
    rev: v3.0.0-alpha.6
    hooks:
      - id: prettier
        additional_dependencies:
          - [email protected]
          - "@prettier/[email protected]"
      - id: prettier
        files: .(launch|test|world)$
        additional_dependencies:
          - [email protected]
          - "@prettier/[email protected]"
  - repo: https://github.com/codespell-project/codespell
    rev: v2.2.4
    hooks:
      - id: codespell
        args: [--ignore-words-list=thw]
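
For local development, a minimal sketch of how this configuration might be exercised (assuming the `pre-commit` package is installed with pip; adjust to your environment):

```bash
# Install the pre-commit CLI (assumed to be available via pip)
pip install pre-commit

# Register the hooks from .pre-commit-config.yaml so they run on each `git commit`
pre-commit install

# Optionally run every hook against the whole tree once,
# mirroring the "Run pre-commit on all the files" step in this commit
pre-commit run --all-files
```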
204 changes: 135 additions & 69 deletions README.md
@@ -1,18 +1,17 @@
# GKNet
Grasp Keypoint Network for Grasp Candidates Detection
# GKNet: Grasp Keypoint Network for Grasp Candidates Detection

![](https://github.com/ivalab/GraspKpNet/blob/main/demo/fig_ill_mul_resized.png)

>[**GKNet: grasp keypoint network for Grasp Candidates Detection**](),
![fig_ill_mul_resized](./docs/demo/fig_ill_mul_resized.png)

> **GKNet: grasp keypoint network for Grasp Candidates Detection** <br>
> Ruinian Xu, Fu-Jen Chu and Patricio A. Vela
## Table of Contents

- [Abstract](#Abstract)
- [Highlights](#Highlights)
- [Vision Benchmark Results](#Vision-Benchmark-Results)
* [Cornell](#Grasp-detection-on-the-Cornell-Dataset)
* [AJD](#Grasp-detection-on-the-AJD)
- [Cornell](#Grasp-detection-on-the-Cornell-Dataset)
- [AJD](#Grasp-detection-on-the-AJD)
- [Installation](#Installation)
- [Dataset](#Dataset)
- [Usage](#Usage)
@@ -21,118 +20,185 @@ Grasp Keypoint Network for Grasp Candidates Detection
- [Supplemental Material](#Supplemental-Material)

## Abstract
Contemporary grasp detection approaches employ deep learning to achieve robustness to sensor and object model
uncertainty. The two dominant approaches design either grasp-quality scoring or anchor-based grasp recognition
networks. This paper presents a different approach to grasp detection by treating it as keypoint detection. The deep
network detects each grasp candidate as a pair of keypoints, convertible to the grasp representation g = {x, y, w, θ}ᵀ,
rather than a triplet or quartet of corner points. Decreasing the detection difficulty by grouping keypoints into pairs boosts
performance. The addition of a non-local module into the grasp keypoint detection architecture promotes dependencies
between a keypoint and its corresponding grasp candidate keypoint. A final filtering strategy based on discrete and
continuous orientation prediction removes false correspondences and further improves grasp detection performance.
GKNet, the approach presented here, achieves the best balance of accuracy and speed on the Cornell and the abridged
Jacquard dataset (96.9% and 98.39% at 41.67 and 23.26 fps). Follow-up experiments on a manipulator evaluate GKNet
using 4 types of grasping experiments reflecting different nuisance sources: static grasping, dynamic grasping, grasping
at varied camera angles, and bin picking. GKNet outperforms reference baselines in static and dynamic grasping
experiments while showing robustness to grasp detection for varied camera viewpoints and bin picking experiments.
The results confirm the hypothesis that grasp keypoints are an effective output representation for deep grasp networks
that provide robustness to expected nuisance factors.

Contemporary grasp detection approaches employ deep learning to achieve robustness to sensor and object model uncertainty.
The two dominant approaches design either grasp-quality scoring or anchor-based grasp recognition networks.
This paper presents a different approach to grasp detection by treating it as keypoint detection.
The deep network detects each grasp candidate as a pair of keypoints, convertible to the grasp representation g = {x, y, w, θ}ᵀ, rather than a triplet or quartet of corner points.
Decreasing the detection difficulty by grouping keypoints into pairs boosts performance.
The addition of a non-local module into the grasp keypoint detection architecture promotes dependencies between a keypoint and its corresponding grasp candidate keypoint.
A final filtering strategy based on discrete and continuous orientation prediction removes false correspondences and further improves grasp detection performance.
GKNet, the approach presented here, achieves the best balance of accuracy and speed on the Cornell and the abridged Jacquard dataset (96.9% and 98.39% at 41.67 and 23.26 fps).
Follow-up experiments on a manipulator evaluate GKNet using 4 types of grasping experiments reflecting different nuisance sources: static grasping, dynamic grasping, grasping at varied camera angles, and bin picking.
GKNet outperforms reference baselines in static and dynamic grasping experiments while showing robustness to grasp detection for varied camera viewpoints and bin picking experiments.
The results confirm the hypothesis that grasp keypoints are an effective output representation for deep grasp networks that provide robustness to expected nuisance factors.

## Highlights
- **Accurate:** The proposed method achieves *96.9%* and *98.39%* detection rate over the Cornell Dataset and AJD, respectively.
- **Fast:** The proposed method is capable of running at real-time speed of *41.67* FPS and *23.26* FPS over the Cornell Dataset and AJD, respectively.

- **Accurate:** The proposed method achieves _96.9%_ and _98.39%_ detection rate over the Cornell Dataset and AJD, respectively.
- **Fast:** The proposed method is capable of running at real-time speed of _41.67_ FPS and _23.26_ FPS over the Cornell Dataset and AJD, respectively.

## Vision Benchmark Results

### Grasp detection on the Cornell Dataset
| Backbone | Acc (w o.f.) / % | Acc (w/o o.f.) / % | Speed / FPS |
|:-------------|:---------------:|:---------------:|:------------:|
|DLA | 96.9 | 96.8 | 41.67 |
|Hourglass-52 | 94.5 | 93.6 | 33.33 |
|Hourglass-104 | 95.5 | 95.3 | 21.27 |
|Resnet-18 | 96.0 | 95.7 | 66.67 |
|Resnet-50 | 96.5 | 96.4 | 52.63 |
|VGG-16 | 96.8 | 96.4 | 55.56 |
|AlexNet | 95.0 | 94.8 | 83.33 |

| Backbone | Acc (w/ o.f.) / % | Acc (w/o o.f.) / % | Speed / FPS |
| :------------ | :--------------: | :----------------: | :---------: |
| DLA | 96.9 | 96.8 | 41.67 |
| Hourglass-52 | 94.5 | 93.6 | 33.33 |
| Hourglass-104 | 95.5 | 95.3 | 21.27 |
| Resnet-18 | 96.0 | 95.7 | 66.67 |
| Resnet-50 | 96.5 | 96.4 | 52.63 |
| VGG-16 | 96.8 | 96.4 | 55.56 |
| AlexNet | 95.0 | 94.8 | 83.33 |

### Grasp detection on the AJD
| Backbone | Acc (w o.f.) / % | Acc (w/o o.f.) / % | Speed / FPS |
|:-------------|:---------------:|:---------------:|:------------:|
|DLA | 98.39 | 96.99 | 23.26 |
|Hourglass-52 | 97.21 | 93.81 | 15.87 |
|Hourglass-104 | 97.93 | 96.04 | 9.90 |
|Resnet-18 | 97.95 | 95.97 | 31.25 |
|Resnet-50 | 98.24 | 95.91 | 25.00 |
|VGG-16 | 98.36 | 96.13 | 21.28 |
|AlexNet | 97.37 | 94.53 | 34.48 |

| Backbone | Acc (w/ o.f.) / % | Acc (w/o o.f.) / % | Speed / FPS |
| :------------ | :--------------: | :----------------: | :---------: |
| DLA | 98.39 | 96.99 | 23.26 |
| Hourglass-52 | 97.21 | 93.81 | 15.87 |
| Hourglass-104 | 97.93 | 96.04 | 9.90 |
| Resnet-18 | 97.95 | 95.97 | 31.25 |
| Resnet-50 | 98.24 | 95.91 | 25.00 |
| VGG-16 | 98.36 | 96.13 | 21.28 |
| AlexNet | 97.37 | 94.53 | 34.48 |

## Installation

Please refer to [INSTALL.md](readme/INSTALL.md) for installation instructions.

## Dataset

The two training datasets are provided here:
- Cornell: [Download link](https://www.dropbox.com/sh/x4t8p2wrqnfevo3/AAC2gLawRtm-986_JWxE0w0Za?dl=0). In case the download link expires in the future, you can also use the matlab scripts provided in the `GKNet_ROOT/scripts/data_aug` to generate your own dataset based on the original Cornell dataset. You will need to modify the corresponding path for loading the input images and output files.
- Abridged Jacquard Dataset (AJD):[Download link](https://smartech.gatech.edu/handle/1853/64897).

- Cornell: [Download link](https://www.dropbox.com/sh/x4t8p2wrqnfevo3/AAC2gLawRtm-986_JWxE0w0Za?dl=0).
If the download link expires in the future, you can also use the MATLAB scripts provided in `GKNet_ROOT/scripts/data_aug` to generate your own dataset based on the original Cornell dataset.
You will need to modify the corresponding path for loading the input images and output files.
- Abridged Jacquard Dataset (AJD): [Download link](https://smartech.gatech.edu/handle/1853/64897).

## Usage
After downloading datasets, place each dataset in the corresponding folder under `GKNet_ROOT/Dataset/`.

After downloading datasets, place each dataset in the corresponding folder under `GKNet_ROOT/datasets/`.
The cornell dataset should be placed under `GKNet_ROOT/datasets/Cornell/` and the AJD should be placed under `GKNet_ROOT/datasets/Jacquard/`.
Download the pretrained [ctdet_coco_dla_2x](https://www.dropbox.com/sh/eicrmhhay2wi8fy/AAAGrToUcdp0tO-F732Xhsxwa?dl=0) model and put it under `GKNet_ROOT/models/`.
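
For reference, a rough sketch of the layout the scripts expect after the downloads above (directory and file names are taken from the surrounding instructions; adjust paths to your checkout):

```bash
# Create the directories referenced by the training/evaluation commands below
mkdir -p datasets/Cornell datasets/Jacquard models

# Expected result, roughly:
#   GKNet_ROOT/
#   ├── datasets/
#   │   ├── Cornell/    <- extracted Cornell download
#   │   └── Jacquard/   <- extracted AJD download
#   └── models/
#       └── ctdet_coco_dla_2x.pth
```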

### Training

For training the Cornell Dataset:
~~~
python main.py dbmctdet_cornell --exp_id dla34 --batch_size 4 --lr 1.25e-4 --arch dla_34 --dataset cornell --load_model ../models/ctdet_coco_dla_2x.pth --num_epochs 15 --val_intervals 1 --save_all --lr_step 5,10
~~~

```bash
python scripts/train.py dbmctdet_cornell \
--exp_id dla34 \
--batch_size 4 \
--lr 1.25e-4 \
--arch dla_34 \
--dataset cornell \
--load_model models/ctdet_coco_dla_2x.pth \
--num_epochs 15 \
--val_intervals 1 \
--save_all \
--lr_step 5,10
```

For training AJD:
~~~
python main.py dbmctdet --exp_id dla34 --batch_size 4 --lr 1.25e-4 --arch dla_34 --dataset jac_coco_36 --load_model ../models/ctdet_coco_dla_2x.pth --num_epochs 30 --val_intervals 1 --save_all
~~~

```bash
python scripts/train.py dbmctdet \
--exp_id dla34 \
--batch_size 4 \
--lr 1.25e-4 \
--arch dla_34 \
--dataset jac_coco_36 \
--load_model models/ctdet_coco_dla_2x.pth \
--num_epochs 30 \
--val_intervals 1 \
--save_all
```

### Evaluation

You can evaluate your own trained models or download [pretrained models](https://www.dropbox.com/sh/eicrmhhay2wi8fy/AAAGrToUcdp0tO-F732Xhsxwa?dl=0) and put them under `GKNet_ROOT/models/`.

For evaluating the Cornell Dataset:
~~~
python test.py dbmctdet_cornell --exp_id dla34_test --arch dla_34 --dataset cornell --fix_res --flag_test --load_model ../models/model_dla34_cornell.pth --ae_threshold 1.0 --ori_threshold 0.24 --center_threshold 0.05
~~~

```bash
python scripts/test.py dbmctdet_cornell \
--exp_id dla34_test \
--arch dla_34 \
--dataset cornell \
--fix_res \
--flag_test \
--load_model models/model_dla34_cornell.pth \
--ae_threshold 1.0 \
--ori_threshold 0.24 \
--center_threshold 0.05
```

For evaluating AJD:
~~~
python test.py dbmctdet --exp_id dla34_test --arch dla_34 --dataset jac_coco_36 --fix_res --flag_test --load_model ../models/model_dla34_ajd.pth --ae_threshold 0.65 --ori_threshold 0.1745 --center_threshold 0.15
~~~

```bash
python scripts/test.py dbmctdet \
--exp_id dla34_test \
--arch dla_34 \
--dataset jac_coco_36 \
--fix_res \
--flag_test \
--load_model models/model_dla34_ajd.pth \
--ae_threshold 0.65 \
--ori_threshold 0.1745 \
--center_threshold 0.15
```

## Develop
If you are interested in training GKNet on a new or customized dataset, please refer to [DEVELOP.md](https://github.com/ivalab/GraspKpNet/blob/master/readme/DEVELOP.md). Also you can leave your issues here if you meet some problems.

If you are interested in training GKNet on a new or customized dataset, please refer to [DEVELOP.md](./docs/DEVELOP.md).
You can also open an issue here if you run into problems.

## Physical Experiments
To run physical experiments with GKNet and ROS, please follow the instructions provided in [Experiment.md](https://github.com/ivalab/GraspKpNet/blob/master/readme/experiment.md).

To run physical experiments with GKNet and ROS, please follow the instructions provided in [Experiment.md](./docs/experiment.md).

## Supplemental Material

This section collects results of experiments and discussions that aren't documented in the manuscript because they don't carry enough scientific value on their own.

### Keypoint representation
This [readme](https://github.com/ivalab/GraspKpNet/blob/main/readme/kp_rep.md) file documents some examples with visualiztions for Top-left, bottom-left and bottom-right (TlBlBr) grasp keypoint representation. These
examples help clarify the effectiveness of grasp keypoint representation of less number of keypoints.

This [readme](./docs/kp_rep.md) file documents some examples with visualizations of the top-left, bottom-left and bottom-right (TlBlBr) grasp keypoint representation.
These examples help clarify the effectiveness of a grasp keypoint representation with fewer keypoints.

### Tuning hyper-parameters of alpha, beta and gamma.
The result is recorded in [tune_hp.md](https://github.com/ivalab/GraspKpNet/blob/main/readme/tune_kp.md)

The results are recorded in [tune_hp.md](./docs/tune_kp.md).

### Demo video

The demo video of all physical experiments is available on [YouTube](https://www.youtube.com/watch?v=Q8-Kr8Q9vC0). Please watch it if you are interested.

### Detailed Result and Tables
Some of the source data was summarized with the raw source data not provided. The links below provide access to the source material:
- [Trial results of bin picking](https://github.com/ivalab/GraspKpNet/blob/main/readme/bin_picking.md) experiment.
- [6-DoF summary results](https://github.com/ivalab/GraspKpNet/blob/main/readme/bin_picking_6DoF.md) for clutter clearance or bin-picking tasks.
### Detailed Result and Tables

Some of the source data was only reported in summarized form, without the raw data being provided.
The links below provide access to the source material:

- [Trial results of bin picking](./docs/bin_picking.md) experiment.
- [6-DoF summary results](./docs/bin_picking_6DoF.md) for clutter clearance or bin-picking tasks.

### Implementation of GGCNN
Considering that GGCNN didn't provide the result of training and testing on the Cornell Dataset, we implemented their work based on their public
repository. The modified version is provided [here](https://github.com/ivalab/ggcnn).

Since GGCNN did not provide results for training and testing on the Cornell Dataset, we implemented their work based on their public repository.
The modified version is provided [here](https://github.com/ivalab/ggcnn).

## License

GKNet is released under the MIT License (refer to the LICENSE file for details).
Portions of the code are borrowed from [CenterNet](https://github.com/xingyizhou/CenterNet), [dla](https://github.com/ucbdrive/dla) (DLA network), [DCNv2](https://github.com/CharlesShang/DCNv2)(deformable convolutions). Please refer to the original License of these projects (See [Notice](https://github.com/ivalab/GKNet/blob/master/NOTICE)).
Portions of the code are borrowed from [CenterNet](https://github.com/xingyizhou/CenterNet), [dla](https://github.com/ucbdrive/dla) (DLA network), and [DCNv2](https://github.com/CharlesShang/DCNv2) (deformable convolutions).
Please refer to the original licenses of these projects (see [NOTICE](https://github.com/ivalab/GKNet/blob/master/NOTICE)).

## Citation

If you use GKNet in your work, please cite:

```
@article{xu2021gknet,
title={GKNet: grasp keypoint network for grasp candidates detection},
25 changes: 25 additions & 0 deletions docker-compose.yml
@@ -0,0 +1,25 @@
# Launch a ros master with the controller manager and associated services
version: "3.8"

services:
  # base container -- will simply exit once brought up
  # we can run commands via this container for running experiments, etc.
  base: &base
    build:
      context: .
      dockerfile: docker/Dockerfile.noetic
    image: ivalab/gknet:latest
    network_mode: host
    shm_size: 2gb
    volumes:
      - ./:/app/
      - .cache/torch:/root/.cache/torch
  gpu:
    <<: *base
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]
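
As a rough sketch of how these services could be used to reproduce the benchmark results (the `base` and `gpu` service names come from the file above; the training invocation simply mirrors the README command and is illustrative, not the only supported entry point):

```bash
# Build the image defined in docker/Dockerfile.noetic
docker compose build base

# Launch a one-off training run on the GPU-enabled service
docker compose run --rm gpu \
    python scripts/train.py dbmctdet_cornell \
        --exp_id dla34 --batch_size 4 --lr 1.25e-4 --arch dla_34 \
        --dataset cornell --load_model models/ctdet_coco_dla_2x.pth \
        --num_epochs 15 --val_intervals 1 --save_all --lr_step 5,10
```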