skywangxy/phasegate_act

lerobot_policy_act_phasegate

PhaseGate-ACT packaged as a LeRobot third-party policy plugin ("Bring Your Own Policies").

PhaseGate adds a small learnable per-camera scalar gate on top of LeRobot's ACT (Action Chunking with Transformers) policy. A stochastic hard-concrete (L0) gate produces per-camera weights α_i ∈ [0, 1] from the current-frame joint state and spatial soft-argmax camera features, and re-weights each camera's visual tokens before they enter the ACT encoder — an interpretable "which view matters right now?" knob. With use_phase_gate=False the policy is bit-identical to ACT.
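As a rough illustration of the gating described above, the sketch below computes a hard-concrete gate per camera and scales that camera's tokens by it. It follows the standard hard-concrete formulation (Louizos et al., 2018). The stretch limits, temperature, function names, and fixed gate logits are assumptions for illustration. In the plugin the gate logits come from the joint state and camera features, which is not reproduced here, so this should not be read as the package's implementation.

```python
import math
import random

# Assumed hyperparameters (common hard-concrete defaults, not taken from the package).
GAMMA, ZETA, BETA = -0.1, 1.1, 2.0 / 3.0


def _sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def hard_concrete_gate(log_alpha, training=True, rng=random):
    """Return one gate value in [0, 1] for a single camera logit."""
    if training:
        # Stochastic path: logistic noise from u ~ U(0, 1), kept away from 0 and 1.
        u = min(max(rng.random(), 1e-6), 1.0 - 1e-6)
        s = _sigmoid((math.log(u) - math.log(1.0 - u) + log_alpha) / BETA)
    else:
        # Deterministic path commonly used at evaluation time.
        s = _sigmoid(log_alpha)
    # Stretch to (GAMMA, ZETA), then clip so exact 0 and 1 are reachable.
    return min(1.0, max(0.0, s * (ZETA - GAMMA) + GAMMA))


def reweight_tokens(tokens_per_camera, gates):
    """Scale every token vector of camera i by its scalar gate alpha_i."""
    return [
        [[alpha * value for value in token] for token in tokens]
        for tokens, alpha in zip(tokens_per_camera, gates)
    ]


# Two cameras (e.g. wrist and overhead), two 2-d tokens each; logits are made up.
tokens = [[[1.0, 2.0], [3.0, 4.0]], [[5.0, 6.0], [7.0, 8.0]]]
gates = [hard_concrete_gate(a, training=False) for a in (10.0, -10.0)]
weighted = reweight_tokens(tokens, gates)
```

With a strongly positive logit the first camera passes through unchanged, while a strongly negative logit closes the second camera's gate completely, which is what makes the gate values readable as a per-view importance signal.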

⚠️ Two cameras only. The current version is developed and verified for a two-camera setup (e.g. a wrist view + an overhead view). Other camera counts are not supported yet — the gate's design and the reported findings (the two-crossover phase signature) assume exactly two views.

Install

Requires lerobot; pip installs it from PyPI automatically if it is not already present.

# from GitHub (community)
pip install "git+https://github.com/skywangxy/phasegate_act.git"

# or from a local clone (development)
git clone https://github.com/skywangxy/phasegate_act.git
cd phasegate_act
pip install -e .

The GitHub repo is named phasegate_act, but the installed Python distribution is lerobot_policy_act_phasegate — that lerobot_policy_ prefix is what LeRobot's plugin discovery scans for, so it must stay.

Installing registers the act_phasegate policy type with LeRobot automatically: LeRobot's register_third_party_plugins() imports every installed distribution whose name starts with lerobot_policy_, and this package's __init__.py registers the config on import.
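To check whether the package is visible to this kind of prefix-based discovery, the sketch below lists installed distributions whose names start with lerobot_policy_. It uses only importlib.metadata and is not LeRobot's own register_third_party_plugins() code. The helper names and the dash-to-underscore normalization are assumptions made for illustration.

```python
from importlib import metadata

PLUGIN_PREFIX = "lerobot_policy_"


def normalize(dist_name):
    """Lower-case and map dashes to underscores (assumed naming convention)."""
    return dist_name.lower().replace("-", "_")


def select_plugins(dist_names, prefix=PLUGIN_PREFIX):
    """Keep only names that carry the plugin prefix, sorted and de-duplicated."""
    return sorted({normalize(n) for n in dist_names if normalize(n).startswith(prefix)})


def installed_plugins():
    """List plugin-style distributions installed in the current environment."""
    names = [dist.metadata["Name"] for dist in metadata.distributions()]
    return select_plugins(n for n in names if n)


print(installed_plugins())
```

After a successful install, lerobot_policy_act_phasegate should appear in the printed list.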

Verified environment

This plugin couples to LeRobot internals: it reuses ACT's modeling and processor code, relies on the policy factory's configuration_ → modeling_ module-name resolution, and patches LeRobotDataset for alpha saving. It is therefore version-sensitive. It has been verified with:

Component      Verified version
lerobot        0.4.4
Python         3.10
PyTorch        2.7.1
rerun-sdk      0.26.2 (for visualization)
OS / device    macOS (Apple Silicon), mps and cpu

Other versions may work but are untested; if you hit an import/resolution error after a LeRobot upgrade, pin lerobot==0.4.4.
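A quick way to compare the local environment against the verified versions above is sketched below. It reads installed versions through importlib.metadata. The helper names are hypothetical, and the comparison is a plain string match, so a local build tag such as a CUDA suffix on the PyTorch version would be reported as a difference.

```python
from importlib import metadata

# Versions from the "Verified environment" table.
VERIFIED = {"lerobot": "0.4.4", "torch": "2.7.1", "rerun-sdk": "0.26.2"}


def installed_version(dist_name):
    """Return the installed version string, or None if the distribution is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def version_report(verified=VERIFIED, lookup=installed_version):
    """Map each distribution to (installed, verified, matches)."""
    report = {}
    for name, wanted in verified.items():
        found = lookup(name)
        report[name] = (found, wanted, found == wanted)
    return report


for name, (found, wanted, ok) in version_report().items():
    status = "ok" if ok else "differs from verified"
    print(f"{name}: installed={found} verified={wanted} ({status})")
```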

Use

lerobot-train \
    --dataset.repo_id=folder_name/dataset_name \
    --policy.type=act_phasegate \
    --policy.use_phase_gate=true \
    --policy.device=mps \
    --output_dir=outputs/train/act_phasegate

Or from Python:

from lerobot_policy_act_phasegate import ACTPhaseGateConfig, ACTPhaseGatePolicy  # noqa: F401
from lerobot.configs.policies import PreTrainedConfig

assert "act_phasegate" in PreTrainedConfig.get_known_choices()

Recording with alpha + visualizing it

Recording with an act_phasegate checkpoint via the official lerobot-record automatically writes a per-frame alpha sidecar (<dataset_folder>/<dataset_name>.npz) next to the dataset — no extra flags, no edits to lerobot. With other policies lerobot-record is unchanged.

Replace the ports, ids, camera indices, dataset repo id, and checkpoint path with your own. --display_data=true additionally streams phase_gate/alpha_<cam> live in rerun alongside the joints.

lerobot-record \
    --robot.type=so101_follower \
    --robot.port=/dev/tty.usbmodemXXXXXXX \
    --robot.id=my_awesome_follower_arm \
    --robot.cameras="{ wrist: {type: opencv, index_or_path: 1, width: 640, height: 480, fps: 30, fourcc: MJPG}, top: {type: opencv, index_or_path: 0, width: 640, height: 480, fps: 30, fourcc: MJPG}}" \
    --teleop.type=so101_leader \
    --teleop.port=/dev/tty.usbmodemYYYYYYY \
    --teleop.id=my_awesome_leader_arm \
    --display_data=true \
    --dataset.repo_id=folder_name/dataset_name \
    --dataset.num_episodes=50 \
    --dataset.single_task="block pick and place" \
    --dataset.episode_time_s=60 \
    --policy.path=outputs/train/act_phasegate/checkpoints/035000/pretrained_model \
    --policy.n_action_steps=1 \
    --policy.temporal_ensemble_coeff=-0.02
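The last two flags turn on ACT's temporal ensembling at inference time. As best understood here, LeRobot's ACT averages the predictions made for the same timestep with weights w_i = exp(-coeff * i), where i = 0 is the oldest prediction, and it requires n_action_steps=1 when ensembling is enabled. The sketch below only illustrates that weighting under this assumption. ensemble_weights and ensemble are hypothetical helper names, not LeRobot functions.

```python
import math


def ensemble_weights(coeff, num_predictions):
    """Normalized weights; index 0 is the oldest prediction."""
    raw = [math.exp(-coeff * i) for i in range(num_predictions)]
    total = sum(raw)
    return [w / total for w in raw]


def ensemble(predictions, coeff):
    """Weighted average of scalar predictions made for the same timestep."""
    weights = ensemble_weights(coeff, len(predictions))
    return sum(w * p for w, p in zip(weights, predictions))


weights = ensemble_weights(-0.02, 4)
# A negative coefficient makes the weights grow with i, favoring newer predictions.
print([round(w, 4) for w in weights])
```

Under this reading, the value -0.02 used in the command leans slightly toward the most recent prediction, whereas a coefficient of 0 would average all predictions uniformly.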

The package ships a companion viewer that overlays the alpha values read from that sidecar. It never runs a policy, and it falls back to plain replay if no sidecar exists:

lerobot-phasegate-dataset-viz --repo-id folder_name/dataset_name --episode-index 0
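To check what a sidecar contains without the viewer, it helps that an .npz file is a zip archive holding one .npy member per array. The sketch below lists the stored array names using only the standard library. It makes no assumption about which keys the plugin writes, list_npz_arrays is a hypothetical helper, and the path is a placeholder.

```python
import zipfile
from pathlib import Path


def list_npz_arrays(path):
    """Return the array names stored in an .npz archive (one .npy member each)."""
    with zipfile.ZipFile(path) as archive:
        return sorted(
            name[: -len(".npy")] for name in archive.namelist() if name.endswith(".npy")
        )


# Placeholder path: substitute the sidecar written next to your dataset.
sidecar = Path("path/to/dataset_name.npz")
if sidecar.exists():
    print(list_npz_arrays(sidecar))
else:
    print(f"no sidecar at {sidecar}")
```

With NumPy available, numpy.load(sidecar)[name] returns the array stored under any of the listed names.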

How LeRobot resolves the plugin

  • Policy class: the factory swaps the configuration_ prefix for modeling_ in the config class's module path and looks up ACTPhaseGatePolicy there (lerobot_policy_act_phasegate.configuration_act_phasegate → ...modeling_act_phasegate).
  • Processor: ACTPhaseGateConfig inherits ACTConfig, so make_pre_post_processors matches the isinstance(cfg, ACTConfig) branch and reuses make_act_pre_post_processors. A name-matching make_act_phasegate_pre_post_processors is also provided as a fallback.
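The module-name swap in the first bullet can be pictured with the string manipulation below. This is a sketch of the naming convention only, not the factory's code. Both helper names are hypothetical, and the Config to Policy class-name mapping is an assumption that happens to fit ACTPhaseGateConfig and ACTPhaseGatePolicy.

```python
def modeling_module_for(config_module):
    """Swap the 'configuration_' prefix of the last module component for 'modeling_'."""
    package, _, leaf = config_module.rpartition(".")
    if not leaf.startswith("configuration_"):
        raise ValueError(f"unexpected config module name: {config_module}")
    leaf = "modeling_" + leaf[len("configuration_"):]
    return f"{package}.{leaf}" if package else leaf


def policy_class_name_for(config_class_name):
    """Assumed naming convention: FooConfig -> FooPolicy."""
    if not config_class_name.endswith("Config"):
        raise ValueError(f"unexpected config class name: {config_class_name}")
    return config_class_name[: -len("Config")] + "Policy"


module = modeling_module_for("lerobot_policy_act_phasegate.configuration_act_phasegate")
cls = policy_class_name_for("ACTPhaseGateConfig")
print(module, cls)
```

This is why the file names configuration_act_phasegate.py and modeling_act_phasegate.py need to keep their matching suffixes.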

Layout

lerobot_policy_act_phasegate/
├── pyproject.toml
├── README.md
├── LICENSE
├── .gitignore
└── src/lerobot_policy_act_phasegate/
    ├── __init__.py                      # registers act_phasegate + installs record hooks
    ├── configuration_act_phasegate.py   # ACTPhaseGateConfig(ACTConfig)
    ├── modeling_act_phasegate.py        # ACTPhaseGatePolicy, ACTPhaseGate
    ├── phase_gate.py                     # PhaseGate module (hard-concrete L0 gate)
    ├── processor_act_phasegate.py       # make_act_phasegate_pre_post_processors
    ├── _record_hooks.py                  # lerobot-record alpha-sidecar saving (import side-effect)
    └── dataset_viz.py                    # lerobot-phasegate-dataset-viz command

Citation

If you use PhaseGate, please cite the report:

@misc{wang2026phasegate,
  title={PhaseGate: Explicit Temporal Gating for Multi-View Visual Fusion in Imitation Learning},
  author={Wang, Xiangyu (Sky)},
  year={2026},
  note={arXiv link pending}
}

PhaseGate builds on ACT (Action Chunking with Transformers):

@article{zhao2023learning,
  title={Learning fine-grained bimanual manipulation with low-cost hardware},
  author={Zhao, Tony Z and Kumar, Vikash and Levine, Sergey and Finn, Chelsea},
  journal={arXiv preprint arXiv:2304.13705},
  year={2023}
}

About

Lightweight per-camera scalar gating for multi-view robot imitation learning — plug-in module for LeRobot ACT, no stage annotations required.
