Port multi host gpu training instructions.
PiperOrigin-RevId: 303779613
saberkun authored and tensorflower-gardener committed Mar 30, 2020
1 parent fc02382 commit 70a3d96
Showing 1 changed file with 16 additions and 2 deletions.
18 changes: 16 additions & 2 deletions official/vision/image_classification/README.md
@@ -29,11 +29,25 @@ provide a few options.
Note: These models will **not** work with TPUs on Colab.

You can train image classification models on Cloud TPUs using
[tf.distribute.experimental.TPUStrategy](https://www.tensorflow.org/api_docs/python/tf/distribute/experimental/TPUStrategy?version=nightly).
If you are not familiar with Cloud TPUs, it is strongly recommended that you go
through the [quickstart](https://cloud.google.com/tpu/docs/quickstart) to learn
how to create a TPU and GCE VM.
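
As a rough sketch (not code from this repository), a training script would
connect to the TPU and build the model under the strategy scope; the TPU name
below is a placeholder for the Cloud TPU you created in the quickstart:

```python
import tensorflow as tf

# Placeholder TPU name; replace with the name of your Cloud TPU.
resolver = tf.distribute.cluster_resolver.TPUClusterResolver(tpu='my-tpu')
tf.config.experimental_connect_to_cluster(resolver)
tf.tpu.experimental.initialize_tpu_system(resolver)
strategy = tf.distribute.experimental.TPUStrategy(resolver)

with strategy.scope():
  # Build and compile the model inside the strategy scope so that its
  # variables are placed on the TPU.
  model = tf.keras.Sequential([tf.keras.layers.Dense(10)])
  model.compile(optimizer='sgd', loss='mse')
```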

### Running on multiple GPU hosts

You can also train these models on multiple hosts, each with GPUs, using
[tf.distribute.experimental.MultiWorkerMirroredStrategy](https://www.tensorflow.org/api_docs/python/tf/distribute/experimental/MultiWorkerMirroredStrategy).
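
As a minimal sketch (assuming `TF_CONFIG` has already been set on each host,
as described below), the training script creates the strategy and builds the
model under its scope:

```python
import tensorflow as tf

# TF_CONFIG must be set in the environment of each host before the strategy
# is constructed (see the TF_CONFIG example below).
strategy = tf.distribute.experimental.MultiWorkerMirroredStrategy()

with strategy.scope():
  # Variables created inside the scope are mirrored across all GPUs on all
  # participating hosts.
  model = tf.keras.Sequential([tf.keras.layers.Dense(10)])
  model.compile(optimizer='sgd', loss='mse')
```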

The easiest way to run multi-host benchmarks is to set the
[`TF_CONFIG`](https://www.tensorflow.org/guide/distributed_training#TF_CONFIG)
environment variable appropriately on each host. For example, to run
`MultiWorkerMirroredStrategy` on 2 hosts, the `cluster` entry in `TF_CONFIG`
should have 2 `host:port` entries, and host `i` should have the `task` in
`TF_CONFIG` set to `{"type": "worker", "index": i}`.
`MultiWorkerMirroredStrategy` will automatically use all the available GPUs on
each host.
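
For example, each host might set `TF_CONFIG` like this before launching the
training script; the host names and port are placeholders, not values from
this repository:

```python
import json
import os

# On host 0; host 1 uses the same "cluster" but "index": 1.
os.environ['TF_CONFIG'] = json.dumps({
    'cluster': {
        'worker': ['host1.example.com:2222', 'host2.example.com:2222']
    },
    'task': {'type': 'worker', 'index': 0}
})
```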

## MNIST

To download the data and run the MNIST sample model locally for the first time,