Description
- [ v ] I am using the latest TensorFlow Model Garden release and TensorFlow 2.
- [ v ] I am reporting the issue to the correct repository. (Model Garden official or research directory)
- [ v ] I checked to make sure that this issue has not already been filed.
1. The entire URL of the file you are using
http://download.tensorflow.org/models/object_detection/tf2/20200713/centernet_hg104_512x512_coco17_tpu-8.tar.gz
http://download.tensorflow.org/models/object_detection/tf2/20200711/efficientdet_d2_coco17_tpu-32.tar.gz
http://download.tensorflow.org/models/object_detection/tf2/20200711/ssd_resnet50_v1_fpn_640x640_coco17_tpu-8.tar.gz
2. Describe the bug
When training the CenterNet model with the TensorFlow Object Detection API, CPU and RAM usage is very high while GPU usage stays at essentially 0%.
This does not happen when training efficientdet_d2 or ssd_resnet50 on the same image dataset, where CPU, RAM and GPU are all used: see screenshots below.
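One quick check that may help narrow this down is whether TensorFlow sees the GPU from the same environment that launches the CenterNet training. The sketch below relies on `tf.config.list_physical_devices`; the helper name `describe_gpu_visibility` is made up for illustration. Since the other two models do use the GPU, the device is presumably visible, so this mainly rules out an environment mix-up.

```python
def describe_gpu_visibility(gpu_names):
    """Turn a list of visible GPU device names into a one-line summary."""
    if not gpu_names:
        return "No GPU visible to TensorFlow; training will run on the CPU."
    return f"{len(gpu_names)} GPU(s) visible: " + ", ".join(gpu_names)


if __name__ == "__main__":
    try:
        import tensorflow as tf
    except ImportError:
        print("TensorFlow is not installed in this environment.")
    else:
        # Each entry is a PhysicalDevice with a .name such as
        # "/physical_device:GPU:0".
        gpus = tf.config.list_physical_devices("GPU")
        print(describe_gpu_visibility([gpu.name for gpu in gpus]))
```

If the list comes back empty here but not in the environment used for the other two models, the difference is likely in the setup rather than in the CenterNet config.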
3. Steps to reproduce
Train the CenterNet model from the TF Object Detection API with the following pipeline.config file:
model {
  center_net {
    num_classes: 1
    feature_extractor {
      type: "hourglass_104"
      channel_means: 104.01361846923828
      channel_means: 114.03422546386719
      channel_means: 119.91659545898438
      channel_stds: 73.60276794433594
      channel_stds: 69.89082336425781
      channel_stds: 70.91507720947266
      bgr_ordering: true
    }
    image_resizer {
      keep_aspect_ratio_resizer {
        min_dimension: 512
        max_dimension: 512
        pad_to_max_dimension: true
      }
    }
    object_detection_task {
      task_loss_weight: 1.0
      offset_loss_weight: 1.0
      scale_loss_weight: 0.10000000149011612
      localization_loss {
        l1_localization_loss {
        }
      }
    }
    object_center_params {
      object_center_loss_weight: 1.0
      classification_loss {
        penalty_reduced_logistic_focal_loss {
          alpha: 2.0
          beta: 4.0
        }
      }
      min_box_overlap_iou: 0.6
      max_box_predictions: 50
    }
  }
}
train_config {
  batch_size: 2
  data_augmentation_options {
    random_horizontal_flip {
    }
  }
  data_augmentation_options {
    random_crop_image {
      min_aspect_ratio: 0.5
      max_aspect_ratio: 1.7000000476837158
      random_coef: 0.25
    }
  }
  data_augmentation_options {
    random_adjust_hue {
    }
  }
  data_augmentation_options {
    random_adjust_contrast {
    }
  }
  data_augmentation_options {
    random_adjust_saturation {
    }
  }
  data_augmentation_options {
    random_adjust_brightness {
    }
  }
  data_augmentation_options {
    random_absolute_pad_image {
      max_height_padding: 200
      max_width_padding: 200
      pad_color: 0.0
      pad_color: 0.0
      pad_color: 0.0
    }
  }
  optimizer {
    adam_optimizer {
      learning_rate {
        manual_step_learning_rate {
          initial_learning_rate: 0.0010000000474974513
          schedule {
            step: 1000
            learning_rate: 9.999999747378752e-05
          }
          schedule {
            step: 5000
            learning_rate: 9.999999747378752e-06
          }
        }
      }
      epsilon: 1.0000000116860974e-07
    }
    use_moving_average: false
  }
  fine_tune_checkpoint: "pre-trained-models/centernet_hg104_512x512_coco17_tpu-8/checkpoint/ckpt-0"
  num_steps: 5000
  max_number_of_boxes: 50
  unpad_groundtruth_tensors: false
  fine_tune_checkpoint_type: "detection"
  fine_tune_checkpoint_version: V2
}
train_input_reader {
  label_map_path: "annotations/label_map.pbtxt"
  tf_record_input_reader {
    input_path: "annotations/train.record"
  }
}
eval_config {
  metrics_set: "coco_detection_metrics"
  use_moving_averages: false
  batch_size: 1
}
eval_input_reader {
  label_map_path: "annotations/label_map.pbtxt"
  shuffle: false
  num_epochs: 1
  tf_record_input_reader {
    input_path: "annotations/test.record"
  }
}
4. Expected behavior
I expected to see reasonably high GPU usage during CenterNet training as well.
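To put a number on "GPU usage" independently of the Windows Task Manager graphs (which, depending on the selected engine, may not show compute load), utilization can be read from `nvidia-smi` while training runs. This is a minimal sketch: the function names are hypothetical, and it assumes `nvidia-smi` is on the PATH and reports plain integers for the queried fields.

```python
import shutil
import subprocess

# Query utilization and memory as bare comma-separated numbers.
QUERY = [
    "nvidia-smi",
    "--query-gpu=utilization.gpu,memory.used,memory.total",
    "--format=csv,noheader,nounits",
]


def parse_gpu_stats(text):
    """Parse nvidia-smi CSV output into one dict per GPU."""
    stats = []
    for line in text.strip().splitlines():
        util, used, total = (int(field.strip()) for field in line.split(","))
        stats.append(
            {"util_percent": util, "mem_used_mib": used, "mem_total_mib": total}
        )
    return stats


def query_gpu_stats():
    """Run nvidia-smi once; return an empty list if it is not installed."""
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    return parse_gpu_stats(result.stdout)


if __name__ == "__main__":
    for index, gpu in enumerate(query_gpu_stats()):
        print(
            f"GPU {index}: {gpu['util_percent']}% utilization, "
            f"{gpu['mem_used_mib']}/{gpu['mem_total_mib']} MiB"
        )
```

Running this a few times during CenterNet training and again during EfficientDet training gives comparable figures for both cases.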
5. Additional context
Include any logs that would be helpful to diagnose the problem.
6. System information
OS: Windows 10
CPU: Intel Core i9-10980HK
RAM: 32 GB
GPU: RTX 3080, 8 GB dedicated memory
TensorFlow: 2.5
CUDA: 11.3.1
cuDNN: 8.2.1.32
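The CUDA and cuDNN versions listed above are the ones installed on the machine, which are not necessarily the ones the TensorFlow wheel was built against. A minimal sketch for printing the build-time versions, assuming `tf.sysconfig.get_build_info()` is available (TensorFlow 2.3 or later). The helper `format_build_info` is hypothetical, and the exact keys and value format may differ between platforms, so missing keys are reported as "unknown".

```python
def format_build_info(info):
    """Pick the CUDA-related entries out of a build-info mapping."""
    keys = ("is_cuda_build", "cuda_version", "cudnn_version")
    return {key: info.get(key, "unknown") for key in keys}


if __name__ == "__main__":
    try:
        import tensorflow as tf
    except ImportError:
        print("TensorFlow is not installed in this environment.")
    else:
        print("TensorFlow", tf.__version__)
        build_info = format_build_info(tf.sysconfig.get_build_info())
        for key, value in build_info.items():
            print(f"{key}: {value}")
```

Comparing this output with the installed versions can show whether the installation matches what the wheel expects.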