diff --git a/examples/auto_parallel/README.md b/examples/auto_parallel/README.md
index 0564c1257..b04f68b4f 100644
--- a/examples/auto_parallel/README.md
+++ b/examples/auto_parallel/README.md
@@ -17,6 +17,8 @@ The CUDA driver on your machine should be ≥525.60.13, and the CUDA toolkit
 ## Runtime Environment Preparation
 `mpirun python -m pip install -r requirements.txt --force-reinstall`
 
+Note: paddlepaddle-gpu version requirement: 3.2.0 or later. [install Paddle](https://www.paddlepaddle.org.cn/install/quick?docurl=undefined)
+
 ## Start Pre-Training
 After the environment is ready, pre-training on 56 GPUs can be launched by:
 `mpirun bash train_4p5_300B_A47B.sh`,
@@ -26,3 +28,9 @@ should be replaced according to the real environment.
 
 The toolkit provides an auto-parallel solution for ERNIE-4.5 pre-training, including the hybrid parallelism training strategy.
 More advanced optimizations are on the way.
+
+
+Currently, the auto-parallel intermediate API has some limitations under ongoing development:
+
+- Limited support for MOE
+- Limited support for VPP in pipeline parallelism (default USE_VPP=0 in scripts; when USE_VPP=1, basic API are used for modeling)
diff --git a/examples/auto_parallel/README_zh.md b/examples/auto_parallel/README_zh.md
index a86948690..477bbd4ab 100644
--- a/examples/auto_parallel/README_zh.md
+++ b/examples/auto_parallel/README_zh.md
@@ -17,6 +17,8 @@
 ## 环境准备
 `mpirun python -m pip install -r requirements.txt --force-reinstall`
 
+注意:paddlepaddle-gpu 需要使用 3.2 版本,安装可使用[参考](https://www.paddlepaddle.org.cn/install/quick?docurl=undefined)
+
 ## 开始训练
 在准备好环境后。您可以通过执行以下命令来进行56卡预训练:
 `mpirun bash train_4p5_300B_A47B.sh`,
@@ -24,3 +26,7 @@
 - 注意,您需要将 `train_4p5_300B_A47B.sh` 中的 `master_ip` 与 `port` 根据您的环境进行替换。
 
 该工具包提供了使用自动并行完成 ERNIE-4.5 预训练的方法,包括多维混合并行训练策略,更多的优化点和功能会基于此版本持续更新。
+
+现在自动并行中层API存在一些局限性,正在进一步支持:
+- 对 MOE 的支持不完备
+- 对流水线并行中的 VPP 优化支持不完备(脚本中默认 USE_VPP=0;当设置 USE_VPP=1 时,采用基础API完成组网)
diff --git a/examples/auto_parallel/requirements.txt b/examples/auto_parallel/requirements.txt
index b8214d254..e7a68b28c 100644
--- a/examples/auto_parallel/requirements.txt
+++ b/examples/auto_parallel/requirements.txt
@@ -1,2 +1,5 @@
 paddlepaddle-gpu
-paddleformers
+paddleformers>=0.2.0
+tensorboardX>=2.6.4
+decord>=0.6.0
+moviepy>=2.2.1