TensorBoard event file is very large (80 GB) while training with EfficientDet D0 · #9052

Open
@Dhivya-rav

Prerequisites

Please answer the following questions for yourself before submitting an issue.

  • I am using the latest TensorFlow Model Garden release and TensorFlow 2.
  • I am reporting the issue to the correct repository. (Model Garden official or research directory)
  • I checked to make sure that this issue has not already been filed.

1. The entire URL of the file you are using

...
https://github.com/tensorflow/models/blob/da23acba8ecb8c0e7c9a83cdb9f10092895c9dcc/research/object_detection/model_main_tf2.py

2. Describe the bug

I am training a single-class object detector using the EfficientDet D0 512x512 model from the TF2 detection zoo. After 200k training steps the log file is already more than 80 GB and still growing. The number of training steps set in the config file is 300k.
I have also seen unusually large log files with SSD MobileNet V2 and Faster R-CNN (10 GB for 100,000 steps). In comparison, the TF1 training log files for 300,000 steps were less than 1 GB.
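
To make the sizes above easier to compare across runs, here is a minimal sketch that lists the event files under a training directory, largest first. It uses only the Python standard library and assumes the files carry the usual `tfevents` marker in their names. The function name `event_file_sizes` and the command-line handling are hypothetical and not part of the Object Detection API.

```python
import os
import sys


def event_file_sizes(log_dir):
    """Return (path, size_in_bytes) for each event file under log_dir, largest first."""
    results = []
    for root, _dirs, files in os.walk(log_dir):
        for name in files:
            # Assumption: TensorBoard event files contain "tfevents" in their name.
            if "tfevents" in name:
                path = os.path.join(root, name)
                results.append((path, os.path.getsize(path)))
    return sorted(results, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Usage: python event_sizes.py <model_dir>  (defaults to the current directory)
    target = sys.argv[1] if len(sys.argv) > 1 else "."
    for path, size in event_file_sizes(target):
        print(f"{size / 1e9:8.2f} GB  {path}")
```

Running this against the model directory of each run (TF1 and TF2) would show whether the growth comes from the training event file, the evaluation event file, or both.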

3. Steps to reproduce

Steps to reproduce the behavior.

4. Expected behavior

A clear and concise description of what you expected to happen.

5. Additional context

Include any logs that would be helpful to diagnose the problem.

6. System information

  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04):
  • Mobile device name if the issue happens on a mobile device:
  • TensorFlow installed from (source or binary): source
  • TensorFlow version (use command below): 2.2
  • Python version: 3.6
  • Bazel version (if compiling from source):
  • GCC/Compiler version (if compiling from source):
  • CUDA/cuDNN version:
  • GPU model and memory:
