Special Feature for This Caffe Repository
- Clone from D-X-Y/caffe-faster-rcnn
commit 8ba1d26
, will continuely updating. - support FPN (Feature Pyramid Network)
- script for merging
Conv + BatchNorm + Scale
layers to 1 layer when those layer are freezed to reduce memory - support snapshot after got -SIGTERM (kill command's default signal)
- Faster rcnn joint train, test and evaluate
- Action recognition (Two Stream CNN)
Special layers
- ROIAlign proposed in Mask R-CNN
- FocalLoss in Focal Loss for Dense Object Detection
- Swish Activation function in Searching for Activation Functions
- eltwise layer using in-place sum to reduce memory, from this PR
Data Preprocess
- support Histogram equalization of color image
- random flip horizontal
TODO list
- support batch image greater than 1
- support Rotated R-CNN for rotated bounding box
- OHEM
- caffe layer module
The official Faster R-CNN code (written in MATLAB) is available here. If your goal is to reproduce the results in our NIPS 2015 paper, please use the official code.
This repository contains a C++ reimplementation of the Python code(py-faster-rcnn). This C++ implementation is built on the offcial caffe, I will continue to update this code for improvement and up-to-date by offcial caffe.
All following steps, you should do these in the $CAFFE_ROOT
path.
Using sh example/FRCNN/demo_frcnn.sh
, the will process five pictures in the examples/FRCNN/images
, and put results into examples/FRCNN/results
.
Note: You should prepare the trained caffemodel into models/FRCNN
, such as ZF_faster_rcnn_final.caffemodel
for ZF model.
- The list of training data is
examples/FRCNN/dataset/voc2007.trainval
. - The list of testing data is
examples/FRCNN/dataset/voc2007.trainval
. - Create symlinks for the PASCAL VOC dataset
ln -s $YOUR_VOCdevkit_Path $CAFFE_ROOT/VOCdevkit
.
As shown in VGG example models/FRCNN/vgg16/train_val.proto
, the original pictures should appear at $CAFFE_ROOT/VOCdevkit/VOC2007/JPEGImages/
. (Check window_data_param in FrcnnRoiData)
If you want to train Faster R-CNN on your own dataset, you may prepare custom dataset list. The format is as below
# image-id
image-name
number of boxes
label x1 y1 x2 y2 difficulty
...
sh examples/FRCNN/zf/train_frcnn.sh
will start training process of voc2007 data using ZF model.
The ImageNet pre-trained models can be found in this link
If you use the provided training script, please make sure:
- VOCdevkit is within $CAFFE_ROOT and VOC2007 in within VOCdevkit
- ZF pretrain model should be put into models/FRCNN/ as ZF.v2.caffemodel
examples/FRCNN/convert_model.py
transform the parameters of bbox_pred
layer by mean and stds values,
because the regression value is normalized during training and we should recover it to obtain the final model.
sh examples/FRCNN/zf/test_frcnn.sh
the will evaluate the performance of voc2007 test data using the trained ZF model.
- First Step of This Shell : Test all voc-2007-test images and output results in a text file.
- Second Step of This Shell : Compare the results with the ground truth file and calculate the mAP.
Shells and prototxts for different models are listed in the examples/FRCNN
and models/FRCNN
More details in the code:
include/api/FRCNN
andsrc/api/FRCNN
for demo and test apiinclude/caffe/FRCNN
andsrc/caffe/FRCNN
contains all codes related to Faster R-CNN
For synchronous with official caffe
- git remote add caffe https://github.com/BVLC/caffe.git
- git fetch caffe
- git checkout master
- git rebase caffe/master
Rebase the dev branch
- git checkout dev
- git rebase master
- git push -f origin dev
- CUB not found, when compile for GPU version,
frcnn_proposal_layer.cu
requires a head file<cub/cub.cuh>
. CUB is library contained in the official Cuda Toolkit, usually can be found in/usr/local/cuda/include/thrust/system/cuda/detail/
. You should add this path in yourMakefile.config
(trylocate ''cub.cuh''
to find cub on your system) - When Get
error: RPC failed; result=22, HTTP code = 0
, usegit config http.postBuffer 524288000
, increases git buffer to 500mb
See codes src/caffe/ACTION_REC
and include/caffe/ACTION_REC
Caffe is released under the BSD 2-Clause license. The BAIR/BVLC reference models are released for unrestricted use.
Please cite Caffe in your publications if it helps your research:
@article{jia2014caffe,
Author = {Jia, Yangqing and Shelhamer, Evan and Donahue, Jeff and Karayev, Sergey and Long, Jonathan and Girshick, Ross and Guadarrama, Sergio and Darrell, Trevor},
Journal = {arXiv preprint arXiv:1408.5093},
Title = {Caffe: Convolutional Architecture for Fast Feature Embedding},
Year = {2014}
}
@inproceedings{girshick2015fast,
title={Fast R-CNN},
author={Girshick, Ross},
booktitle={International Conference on Computer Vision},
pages={1440--1448},
year={2015}
}
@inproceedings{ren2015faster,
title={Faster {R-CNN}: Towards real-time object detection with region proposal networks},
author={Ren, Shaoqing and He, Kaiming and Girshick, Ross and Sun, Jian},
booktitle={Neural Information Processing Systems},
pages={91--99},
year={2015}
}
@article{ren2017faster,
title={Faster {R-CNN}: Towards real-time object detection with region proposal networks},
author={Ren, Shaoqing and He, Kaiming and Girshick, Ross and Sun, Jian},
journal={IEEE Transactions on Pattern Analysis and Machine Intelligence},
volume={39},
number={6},
pages={1137--1149},
year={2017},
publisher={IEEE}
}