Chi Zhang vermouth1992

🎯 Focusing

Applied Machine Learning/Machine Learning Systems

166 followers · 17 following

USC -> ByteDance
LA -> Shanghai
https://vermouth1992.github.io

Pinned

volcengine/verl Public

verl: Volcano Engine Reinforcement Learning for LLMs

Python · 6.6k stars · 707 forks

drl-portfolio-management Public archive

CSCI 599 deep learning and its applications final project

Jupyter Notebook · 152 stars · 71 forks

synthetic-time-series-smart-grid Public archive

Synthetic Time Series Generation using Generative Adversarial Network

Python · 10 stars · 7 forks

mbrl-hvac Public

Model-based Reinforcement Learning for Building HVAC Control

Jupyter Notebook · 35 stars · 6 forks

bracp Public

Improved Behavior Regularized Offline Reinforcement Learning

Jupyter Notebook · 5 stars · 2 forks

rlutils-python Public

Python 1

321 contributions in the last year


April 2025

Created 2 commits in 1 repository

volcengine/verl 2 commits

Opened 2 pull requests in 1 repository

volcengine/verl 2 merged

[logger] fix: fix mlflow (Apr 14)
[megatron] feat: optimize entropy loss (Apr 10)

Reviewed 32 pull requests in 1 repository

volcengine/verl 25 pull requests

doc: upgrade to vllm 0.8.3 (Apr 14)
fix time for dapo (Apr 14)
mcore readme (Apr 14)
fix: replace '@' with '_at_' in metric names to comply with MLflow naming constraints (Apr 14)
Update vllm 0.8.2 with megatron 0.11.0 (Apr 14)
Fix megatron default config (Apr 13)
fix checkpoint rng_states confliction (Apr 13)
fix: Megatron_workers batch_size config is not processed correctly (Apr 13)
reset default tp size (Apr 12)
tests: add import utils tests (Apr 12)
[mcore] option to use dist checkpoint (Apr 11)
fix: use packaging to compre versions instead of str comparing (Apr 11)
Support fsdp2 for fsdp_worker (Apr 11)
[sglang] docs: fix README index (Apr 11)
Change behaviour during raw prompt extraction (Apr 10)
docs: update recent talks (Apr 10)
fix: wrong pg_clipfrac_lower (Apr 8)
docs: add open-hands, vagen (Apr 8)
fix: optim.warmup_style do not take effect (#418) (Apr 8)
fix: support non-DTensor when converting fsdp checkpoints to hf model (Apr 7)
[algo] misc: remove redundant tile([1, response_length]), efficient broadcast instead (Apr 4)
Support REINFORCE++-baseline and add script for REINFORCE++ (Apr 4)
fix: the error is not raised when using both megatron and hf inference (Apr 3)
docs: add config docs for evaluation.yaml (Apr 3)
fix: misleading eos_mask->response_mask (Apr 3)
Some pull request reviews not shown.
