Commit d5511d2

easyr1 verl val_generations_to_log (#121)
* update
* update verl
1 parent 82dcdb1 commit d5511d2

4 files changed: +58 -3 lines changed


en/guide_cloud/integration/integration-easyr1.md

Lines changed: 13 additions & 0 deletions
@@ -62,6 +62,19 @@ In the `EasyR1` directory, execute the following command to train the Qwen2.5-VL
 bash examples/run_qwen2_5_vl_7b_geo_swanlab.sh
 ```
 
+## 4. Record Generated Text During Each Evaluation Round
+
+If you want to log the generated text to SwanLab during each evaluation round (`val`), add `val_generations_to_log=1` as the last line of the command:
+
+```bash {6}
+python3 -m verl.trainer.main \
+config=examples/grpo_example.yaml \
+worker.actor.model.model_path=${MODEL_PATH} \
+trainer.logger=['console','swanlab'] \
+trainer.n_gpus_per_node=4 \
+val_generations_to_log=1
+```
+
 ## Final Remarks
 
 EasyR1, a reinforcement learning framework for multimodal large models, is a new open-source project by [hiyouga](https://github.com/hiyouga), the author of [LLaMA Factory](https://github.com/hiyouga/LLaMA-Factory). We thank [hiyouga](https://github.com/hiyouga) for his contributions to the global open-source ecosystem, and SwanLab will continue to accompany AI developers.
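
To make the effect of `val_generations_to_log=1` more concrete, here is a minimal Python sketch of how a trainer might choose which validation generations to log. It is not EasyR1's or veRL's actual code: the helper `select_generations_to_log`, the `(prompt, generated text, score)` record layout, and the reading of the value as "number of samples to log per evaluation round" are all assumptions made for illustration.

```python
import random
from typing import List, Tuple

# (prompt, generated text, score); this record layout is an assumption for illustration.
Sample = Tuple[str, str, float]


def select_generations_to_log(samples: List[Sample], n: int, seed: int = 42) -> List[Sample]:
    """Hypothetical helper: pick at most `n` validation samples reproducibly."""
    if n <= 0:
        return []
    ordered = sorted(samples)  # fixed starting order, independent of arrival order
    random.Random(seed).shuffle(ordered)  # seeded shuffle: same picks on every call
    return ordered[:n]


samples = [
    ("2 + 2 = ?", "4", 1.0),
    ("3 * 3 = ?", "6", 0.0),
    ("10 / 2 = ?", "5", 1.0),
]
picked = select_generations_to_log(samples, n=1)
print(picked)
```

Sorting before the seeded shuffle is a design choice in this sketch: it keeps the selection identical across evaluation rounds, so the same prompts can be compared as training progresses.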

en/guide_cloud/integration/integration-verl.md

Lines changed: 15 additions & 1 deletion
@@ -120,4 +120,18 @@ swanlab watch
 
 For more details, refer to [SwanLab Offline Dashboard Mode](https://docs.swanlab.cn/guide_cloud/self_host/offline-board.html).
 
-To set the port number on the server, refer to [Offline Dashboard Port Number](https://docs.swanlab.cn/api/cli-swanlab-watch.html#%E8%AE%BE%E7%BD%AEip%E5%92%8C%E7%AB%AF%E5%8F%A3%E5%8F%B7).
+To set the port number on the server, refer to [Offline Dashboard Port Number](https://docs.swanlab.cn/api/cli-swanlab-watch.html#%E8%AE%BE%E7%BD%AEip%E5%92%8C%E7%AB%AF%E5%8F%A3%E5%8F%B7).
+
+
+## Record Generated Text During Each Evaluation Round
+
+If you want to log the generated text to SwanLab during each evaluation round (`val`), add the line `val_generations_to_log_to_wandb=1` to the command:
+
+```bash {5}
+PYTHONUNBUFFERED=1 python3 -m verl.trainer.main_ppo \
+data.train_files=$HOME/data/gsm8k/train.parquet \
+data.val_files=$HOME/data/gsm8k/test.parquet \
+trainer.logger=['console','swanlab'] \
+val_generations_to_log_to_wandb=1 \
+...
+```
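
When the same launch is scripted rather than typed, the `key=value` overrides can be assembled programmatically. The sketch below is only an illustration: `build_command` is a hypothetical helper, not part of veRL, and it uses only the overrides shown in the command above. The options elided by `...` in the original are deliberately left out, so the result is not a complete training command.

```python
from __future__ import annotations

import os
import shlex


def build_command(module: str, overrides: dict[str, str]) -> list[str]:
    """Hypothetical helper: build argv for `python3 -m <module> key=value ...`."""
    return ["python3", "-m", module] + [f"{key}={value}" for key, value in overrides.items()]


home = os.path.expanduser("~")
overrides = {
    "data.train_files": f"{home}/data/gsm8k/train.parquet",
    "data.val_files": f"{home}/data/gsm8k/test.parquet",
    "trainer.logger": "['console','swanlab']",
    # Override name copied verbatim from the command above.
    "val_generations_to_log_to_wandb": "1",
}
command = build_command("verl.trainer.main_ppo", overrides)

# Print a shell-quoted version for inspection; nothing is executed here.
print(shlex.join(command))
```

Passing `command` as a list to `subprocess.run` (with `PYTHONUNBUFFERED=1` in its `env`) would avoid shell quoting issues around the brackets in `trainer.logger`.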

zh/guide_cloud/integration/integration-easyr1.md

Lines changed: 17 additions & 2 deletions
@@ -38,7 +38,7 @@ bash examples/run_qwen2_5_7b_math_swanlab.sh
 
 Of course, we can take a closer look here: because EasyR1 is a clean fork of the original veRL project, it inherits the [veRL and SwanLab integration](/guide_cloud/integration/integration-verl.md). So let's look at the `run_qwen2_5_7b_math_swanlab.sh` file:
 
-```sh
+```sh {10}
 set -x
 
 export VLLM_ATTENTION_BACKEND=XFORMERS
@@ -48,7 +48,7 @@ MODEL_PATH=Qwen/Qwen2.5-7B-Instruct # replace it with your local file path
 python3 -m verl.trainer.main \
 config=examples/grpo_example.yaml \
 worker.actor.model.model_path=${MODEL_PATH} \
-trainer.logger=['console','swanlab'] \ # [!code ++]
+trainer.logger=['console','swanlab'] \
 trainer.n_gpus_per_node=4
 ```
 
@@ -62,6 +62,21 @@ python3 -m verl.trainer.main \
 bash examples/run_qwen2_5_vl_7b_geo_swanlab.sh
 ```
 
+## 4. Record Generated Text During Each Evaluation Round
+
+If you want to log the generated text to SwanLab during each evaluation round (val), just add the line `val_generations_to_log=1` to the command:
+
+```bash {6}
+python3 -m verl.trainer.main \
+config=examples/grpo_example.yaml \
+worker.actor.model.model_path=${MODEL_PATH} \
+trainer.logger=['console','swanlab'] \
+trainer.n_gpus_per_node=4 \
+val_generations_to_log=1
+```
+
+
+
 ## Final Remarks
 
 EasyR1, a reinforcement learning framework for multimodal large models, is a new open-source project by [hiyouga](https://github.com/hiyouga), the author of [LLaMA Factory](https://github.com/hiyouga/LLaMA-Factory). Thanks to [hiyouga](https://github.com/hiyouga) for his contributions to the global open-source ecosystem; SwanLab will continue to walk alongside AI developers.

zh/guide_cloud/integration/integration-verl.md

Lines changed: 13 additions & 0 deletions
@@ -123,3 +123,16 @@ swanlab watch
 For more details, see [SwanLab Offline Dashboard Mode](https://docs.swanlab.cn/guide_cloud/self_host/offline-board.html)
 
 To set the port number on the server, see [Offline Dashboard Port Number](https://docs.swanlab.cn/api/cli-swanlab-watch.html#%E8%AE%BE%E7%BD%AEip%E5%92%8C%E7%AB%AF%E5%8F%A3%E5%8F%B7)
+
+## Record Generated Text During Each Evaluation Round
+
+If you want to log the generated text to SwanLab during each evaluation round (val), just add the line `val_generations_to_log_to_wandb=1` to the command:
+
+```bash {5}
+PYTHONUNBUFFERED=1 python3 -m verl.trainer.main_ppo \
+data.train_files=$HOME/data/gsm8k/train.parquet \
+data.val_files=$HOME/data/gsm8k/test.parquet \
+trainer.logger=['console','swanlab'] \
+val_generations_to_log_to_wandb=1 \
+...
+```
