Commit 7eedc31 (parent: 4285cd9)

update READMEs

Signed-off-by: Masahiro Tanaka <mtanaka@anyscale.com>

3 files changed (+12 / -10 lines)

training/tensor_parallel/README.md (7 additions, 6 deletions)

```diff
@@ -1,14 +1,15 @@
-# AutoTP training examples
+# AutoTP Training Examples
+
 This folder groups AutoTP training examples at different complexity levels.
 
 ## Contents
-- `basic_example/`: minimal AutoTP + ZeRO-2 example with synthetic tokens. It also shows that AutoTP recognizes typical parameter patterns and automatically applies proper partitioning.
-- `hf_integration/`: Hugging Face Trainer example (adapted from Stanford Alpaca).
-- `custom_patterns/`: AutoTP example with custom layer patterns and a simple
+- [Basic example](basic_example): minimal AutoTP + ZeRO-2 example with synthetic tokens. It also shows that AutoTP recognizes typical parameter patterns and automatically applies proper partitioning.
+- [HuggingFace integration](hf_integration): Hugging Face Trainer example (adapted from Stanford Alpaca).
+- [Custom partitioning patterns](custom_patterns): AutoTP example with custom layer patterns and a simple
   text dataset that uses a DP-rank random sampler. It shows how to define
   parameter partitioning easily for custom models with non-standard parameter
   definitions.
 
 ## Related references
-- AutoTP training docs: https://github.com/deepspeedai/DeepSpeed/blob/master/docs/code-docs/source/training.rst
-- AutoTP training tutorial: https://github.com/deepspeedai/DeepSpeed/blob/master/docs/_tutorials/autotp-training.md
+- [AutoTP training docs](https://deepspeed.readthedocs.io/en/latest/training.html)
+- [AutoTP training tutorial](https://github.com/deepspeedai/DeepSpeed/blob/master/docs/_tutorials/autotp-training.md)
```

training/tensor_parallel/custom_patterns/README.md (3 additions, 1 deletion)

```diff
@@ -1,4 +1,5 @@
-# AutoTP custom patterns example
+# AutoTP (Tensor Parallel) Custom Patterns Example
+
 This example extends the minimal AutoTP script with:
 
 - custom layer sharding patterns (`partition_config`)
@@ -10,6 +11,7 @@ AutoTP is enabled by the DeepSpeed config (`tensor_parallel.autotp_size`), so
 you do not need to call any initialization helpers before `deepspeed.initialize`.
 
 ## Key code (custom patterns)
+
 The config below targets **Pythia 6.9B (GPT-NeoX)**, which uses a fused
 `query_key_value` projection. We provide a `shape` so AutoTP can split the
 fused Q/K/V tensor cleanly across tensor-parallel ranks. The MLP uses
```
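The diff above notes that AutoTP is switched on entirely from the DeepSpeed config (`tensor_parallel.autotp_size`), with no initialization helper needed before `deepspeed.initialize`. A minimal sketch of such a config follows; only the `tensor_parallel.autotp_size` key is taken from these READMEs, while every concrete value (tensor-parallel degree 2, ZeRO stage 2 as paired with AutoTP in the basic example, batch size 8) is an illustrative assumption, and the `partition_config` schema is omitted because the diff does not show it.

```python
# Sketch of a DeepSpeed config dict that enables AutoTP training.
# Only the "tensor_parallel.autotp_size" key is taken from the READMEs above;
# all concrete values here are illustrative assumptions, not the examples'
# actual settings.
ds_config = {
    "train_batch_size": 8,                  # assumed value
    "zero_optimization": {"stage": 2},      # the top-level README pairs AutoTP with ZeRO-2
    "tensor_parallel": {"autotp_size": 2},  # tensor-parallel degree (assumed: 2)
}

# Per the custom_patterns README, no AutoTP init helper is required: passing
# this config straight to deepspeed.initialize is enough, e.g.
#   engine, optimizer, _, _ = deepspeed.initialize(model=model, config=ds_config)
```

The point of the design is that tensor parallelism becomes a config-only switch, so the same training script runs with or without AutoTP depending on the config file it is given.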
training/tensor_parallel/hf_integration/README.md (2 additions, 3 deletions)

````diff
@@ -1,10 +1,9 @@
-# tensor parallel example (Hugging Face Trainer + AutoTP)
+# AutoTP (Tensor Parallel) HuggingFace Integration Example
+
 This project is adapted from https://github.com/tatsu-lab/stanford_alpaca.
 It uses Hugging Face `Trainer` with a DeepSpeed config that enables AutoTP via `tensor_parallel.autotp_size`.
 We only modified the DeepSpeed config and logging, as an example use case.
 
 **Script**
 
 ``` bash run.sh ``` or ```bash run.sh MODE```
-
-
````
