Official code repository for the paper
Learning Massively Multitask World Models for Continuous Control
Nicklas Hansen, Hao Su*, Xiaolong Wang* (UC San Diego)
[Website] [Paper] [Models] [Dataset]
Early access (Nov 2025): This is an early code release; we will continue to add features and improvements in the coming months, but wanted to make the code available to the public as soon as possible. Please let us know if you have any questions or issues by opening an issue on GitHub!
MMBench contains a total of 200 unique continuous control tasks for training massively multitask RL policies. The task suite consists of 159 existing tasks proposed in prior work, 22 new tasks and task variants within these existing domains, and 19 entirely new arcade-style tasks that we dub MiniArcade. MMBench tasks span multiple domains and embodiments, and each task comes with language instructions, demonstrations, and optionally image observations, enabling research on multitask pretraining, offline-to-online RL, and RL from scratch.
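To make the per-task data concrete, here is a minimal sketch in Python of what a task bundle provides according to the description above. All field names and the instruction wording are hypothetical placeholders for illustration, not the repository's actual API:

task = {
    "name": "walker-walk",                            # one of the 200 task identifiers
    "instruction": "Walk forward at a steady pace.",  # per-task language instruction (hypothetical wording)
    "demonstrations": [...],                          # offline trajectories available for pretraining
    "observations": ("state", "rgb"),                 # image observations are optional
}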
Newt is a language-conditioned multitask world model based on TD-MPC2. We train Newt by first pretraining on demonstrations to acquire task-aware representations and action priors, and then jointly optimizing with online interaction across all tasks. To extend TD-MPC2 to the massively multitask online setting, we propose a series of algorithmic improvements including a refined architecture, model-based pretraining on the available demonstrations, additional action supervision in RL policy updates, and a drastically accelerated training pipeline.
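The overall recipe can be summarized in a short sketch. The function and method names below (pretrain, finetune_online, world_model_loss, etc.) are hypothetical placeholders for illustration, not the actual implementation; see train.py for the real entry point:

def pretrain(model, demo_batches):
    # Phase 1: model-based pretraining on demonstrations, which fits the
    # world model and an action prior (behavior cloning on demo actions).
    for batch in demo_batches:
        loss = model.world_model_loss(batch) + model.bc_loss(batch)
        model.update(loss)

def finetune_online(model, envs, num_steps):
    # Phase 2: joint online optimization across all tasks, with additional
    # action supervision added to the RL policy update.
    for _ in range(num_steps):
        batch = envs.collect(model)                      # act with the world model, store transitions
        loss = (model.world_model_loss(batch)
                + model.policy_loss(batch)               # standard TD-MPC2-style policy objective
                + model.action_supervision_loss(batch))  # extra action supervision
        model.update(loss)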
We provide a Dockerfile for easy installation. You can build the Docker image by running
cd docker && docker build . -t <user>/newt:1.0.0
This image contains all dependencies needed to run MMBench and Newt.
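Once built, you can start a container from the image, for example as follows (assuming a GPU host with the NVIDIA Container Toolkit installed):

docker run --rm -it --gpus all <user>/newt:1.0.0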
Agents can be trained by running the train.py script. Below are some example commands:
$ python train.py # <-- a 20M parameter agent trained on all 200 MMBench tasks
$ python train.py model_size=XL # <-- an 80M parameter agent
$ python train.py model_size=B task=walker-walk # <-- a 5M parameter single-task agent
$ python train.py obs=rgb # <-- a 20M parameter agent trained with state+RGB observations
We recommend using the default hyperparameters, including the default model size of 20M parameters (model_size=L). See config.py for a full list of arguments.
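Arguments can be combined freely using the same key=value syntax shown above, e.g. to train an 80M parameter single-task agent with state+RGB observations:

$ python train.py model_size=XL obs=rgb task=walker-walk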
If you find our work useful, please consider citing our paper as follows:
@misc{Hansen2025Newt,
  title={Learning Massively Multitask World Models for Continuous Control},
  author={Nicklas Hansen and Hao Su and Xiaolong Wang},
  note={Preprint},
  url={https://www.nicklashansen.com/NewtWM},
  year={2025}
}
You are very welcome to contribute to this project. Feel free to open an issue or pull request with suggestions or bug reports, but please review our guidelines first, and understand that we will not be able to respond to pull requests or issues while the submission is under review.
This project is licensed under the MIT License - see the LICENSE file for details. Note that the repository relies on third-party code, which is subject to its own respective licenses.