RDD is a retrieval-based visual demonstration decomposer that automatically identifies sub-tasks that are visually similar to a set of existing expert-labeled sub-tasks.
Mingxuan Yan1, Yuping Wang1,2, Zechun Liu3, Jiachen Li1
1 University of California, Riverside
2 University of Michigan
3 Meta AI
- Sub-task Discovery with Prior: Unlike heuristic sub-task discovery algorithms that use no prior, such as UVD, RDD identifies sub-tasks that are visually similar to those in a given expert-labeled sub-task dataset. This is especially useful when generating additional sub-tasks for fine-tuning or data augmentation, because it encourages the policy to reuse skills learned from the original dataset.
- Planner-visuomotor Alignment: (YouTube Video) In hierarchical VLAs, the planner, often a powerful VLM, performs task planning and reasoning to break down complex tasks into simpler sub-tasks with step-by-step language instructions. Conditioned on the generated sub-task instructions, a learning-based visuomotor policy, trained on datasets of short-horizon sub-tasks, performs precise manipulation to complete the sub-tasks one by one, thereby completing the long-horizon task. RDD automatically decomposes demonstrations into sub-tasks by aligning the visual features of the decomposed sub-task intervals with those from the training data of the low-level visuomotor policy.

RDD formulates demonstration decomposition as an optimal partitioning problem, using retrieval with approximate nearest neighbor search (ANNS) and dynamic programming to efficiently find the optimal decomposition strategy.
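The formulation above can be illustrated with a minimal sketch. It is not the repository's implementation: the function names (`interval_feature`, `retrieval_score`, `decompose`), the feature (start-frame and end-frame embeddings concatenated), the score (negative distance to the nearest prior feature), and the objective (sum of interval scores) are all assumptions made for illustration, and an exact nearest-neighbor search stands in for ANNS.

```python
import numpy as np


def interval_feature(embeds, start, end):
    """Hypothetical feature for frames [start, end): start and end frame embeddings, concatenated."""
    return np.concatenate([embeds[start], embeds[end - 1]])


def retrieval_score(feature, prior_features):
    """Assumed score: negative L2 distance to the nearest prior feature.

    Exact brute-force search is used here in place of an ANNS index.
    """
    dists = np.linalg.norm(prior_features - feature, axis=1)
    return -float(dists.min())


def decompose(embeds, prior_features, min_len=1, max_len=None):
    """Split frames [0, T) into contiguous intervals that maximize the summed score.

    best[e] holds the best total score of any partition of frames [0, e);
    prev[e] holds the start index of the last interval in that partition.
    """
    T = len(embeds)
    max_len = max_len or T
    best = np.full(T + 1, -np.inf)
    best[0] = 0.0
    prev = np.zeros(T + 1, dtype=int)

    for end in range(1, T + 1):
        for start in range(max(0, end - max_len), end - min_len + 1):
            if np.isinf(best[start]):
                continue  # no valid partition ends at `start`
            feature = interval_feature(embeds, start, end)
            total = best[start] + retrieval_score(feature, prior_features)
            if total > best[end]:
                best[end] = total
                prev[end] = start

    if np.isinf(best[T]):
        raise ValueError("no partition satisfies the length constraints")

    # Walk the back-pointers from the last frame to recover the intervals.
    segments = []
    end = T
    while end > 0:
        start = int(prev[end])
        segments.append((start, end))
        end = start
    return segments[::-1], float(best[T])


# Toy example: 6 frames with 1-D embeddings 0..5, and a prior containing
# the features of the intervals [0, 3) and [3, 6).
embeds = np.arange(6, dtype=float).reshape(6, 1)
prior_features = np.array([[0.0, 2.0], [3.0, 5.0]])
segments, total = decompose(embeds, prior_features)
print(segments)  # [(0, 3), (3, 6)]
```

The dynamic program evaluates O(T · max_len) candidate intervals, each with one nearest-neighbor query, which is why a fast approximate index matters for long demonstrations.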
Set up the Python environment:
conda create -n rdd python==3.9 -y && conda activate rdd
./scripts/setup/setup_rdd_env.sh
The default installation supports only the LIV, CLIP, and ResNet encoders. To install full support for the other encoders (VIP, R3M, VC-1), follow the instructions in setup_rdd_env.sh.
See kitchen_demo.md
See franka_demo.md
RDD allows you to define your own sub-task prior.
In RDD, a sub-task is represented by the feature of its ending frame / starting frame. You can define your own sub-task feature by modifying the sub-task feature generation class subtask_embeds_to_feature in rdd/embed.py:
class subtask_embeds_to_feature(object):
    @staticmethod
    def feature_dim(embed_dim: int, mode: str) -> int:
        <your feature dim>

    def __call__(self, embeds: Union[np.ndarray, torch.Tensor], mode: str) -> np.ndarray:
        """
        input: embeds (L, N)
        output: feature (2N,) or (N,) if mode=='ood'
        """
        <your implementation>

You can also define a completely customized sub-task prior score by modifying rdd_score in rdd/algorithms.py:
def rdd_score(
    subarray: List[int],
    searcher: AnnoySearcher,
    embeds: np.ndarray,
    ...
):
    <your implementation>

- Release the core algorithm and demo scripts on AgiBotWorld & RoboCerebra. (ETA: by the end of Oct. 2025).
- Release use case scripts on hierarchical VLA (RACER). (ETA: delay expected; by the end of Dec. 2025)
If you find this work useful, please consider citing our paper:
@inproceedings{yan2025rdd,
title={RDD: Retrieval-Based Demonstration Decomposer for Planner Alignment in Long-Horizon Tasks},
author={Yan, Mingxuan and Wang, Yuping and Liu, Zechun and Li, Jiachen},
booktitle={Proceedings of the 39th Annual Conference on Neural Information Processing Systems (NeurIPS)},
year={2025},
}