feat(envs): add LIBERO-plus robustness benchmark #3313

Open
pkooij wants to merge 1 commit into main from feat/libero-plus-benchmark

Conversation


@pkooij pkooij commented Apr 8, 2026

Title

feat(envs): add LIBERO-plus robustness benchmark

Type / Scope

  • Type: Feature
  • Scope: src/lerobot/envs/, pyproject.toml, docker/, docs/

Summary / Motivation

LIBERO-plus is a robustness benchmark for VLA models that extends LIBERO with 7 perturbation dimensions (camera viewpoints, object layouts, robot initial states, language instructions, lighting, background textures, sensor noise), producing ~10,000 task variants across the standard LIBERO suites.

Because LIBERO-plus keeps the same Python gym interface as the original LIBERO, the integration is minimal: a one-line config subclass, an import fallback for the different package nesting, and a new pip extras group.

Related issues

  • Related: LIBERO integration (already in feat/async-vector-env)

What changed

  • src/lerobot/envs/libero.py — wraps the top-level LIBERO imports in try/except to handle the extra module nesting level that LIBERO-plus ships with
  • src/lerobot/envs/configs.py — adds LiberoPlusEnv config (@EnvConfig.register_subclass("libero_plus")), a thin subclass of LiberoEnv with task="libero_spatial" as default; fully inherits create_envs and get_env_processors
  • pyproject.toml — adds libero_plus optional dep group and includes it in all
  • docs/source/libero_plus.mdx — new benchmark doc: perturbation dimensions, task suites, install instructions, eval commands, camera name mapping, dataset reference
  • docs/source/_toctree.yml — registers new doc page
  • docker/Dockerfile.benchmark.libero_plus — isolated CI image (adds the libexpat1, libfontconfig1-dev, and libmagickwand-dev system deps required by LIBERO-plus)
  • .github/workflows/benchmark_tests.yml — adds libero-plus-integration-test job (build image + 1-episode smoke eval on aws-g6-4xlarge-plus)

No breaking changes. env.type=libero continues to work unchanged.

Dataset note

pepijn223/libero_plus_lerobot is already in LeRobot v3.0 format — no conversion needed.
Dataset card (README) is missing on the Hub and should be added in a follow-up.
Camera keys: observation.images.front / observation.images.wrist.

How was this tested (or how to run locally)

  • pre-commit run -a passes on all changed files
  • Registration verified locally via PYTHONPATH override:
    from lerobot.envs.configs import EnvConfig
    cfg = EnvConfig.get_choice_class('libero_plus')()
    # → type=libero_plus, task=libero_spatial
    
  • Full eval smoke-test (requires Linux + GPU + LIBERO-plus installed):
    lerobot-eval \
      --policy.path=pepijn223/smolvla_libero \
      --env.type=libero_plus \
      --env.task=libero_spatial \
      --eval.batch_size=1 --eval.n_episodes=1 \
      --eval.use_async_envs=false --policy.device=cuda \
      '--env.camera_name_mapping={"agentview_image": "camera1", "robot0_eye_in_hand_image": "camera2"}' \
      --policy.empty_cameras=1
    Runs automatically via the new CI job on aws-g6-4xlarge-plus.

Checklist (required before merge)

  • Linting/formatting run (pre-commit run -a)
  • All tests pass locally (pytest) — LIBERO-plus requires Linux, validated via CI
  • Documentation updated (docs/source/libero_plus.mdx)
  • CI is green

Reviewer notes

  • The import fallback in libero.py is the only change touching existing code paths. The try branch runs for hf-libero; the except branch for LIBERO-plus. Transparent to callers.
  • LiberoPlusEnv is intentionally a minimal subclass — no duplicated logic.
  • The Docker image uses uv sync --extra libero_plus --no-cache (no --locked) because the GitHub-sourced package is not in uv.lock. Pin a commit SHA in the dep once LIBERO-plus stabilizes.
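The reviewer notes above can be condensed into a Dockerfile sketch. The exact base image and surrounding steps in Dockerfile.benchmark.libero_plus may differ; only the system deps and the uv invocation come from this PR:

```dockerfile
# System deps required by LIBERO-plus (from the PR description).
RUN apt-get update && apt-get install -y --no-install-recommends \
        libexpat1 libfontconfig1-dev libmagickwand-dev \
    && rm -rf /var/lib/apt/lists/*

# --no-cache instead of --locked: the GitHub-sourced LIBERO-plus
# package is not in uv.lock yet. Pin a commit SHA once it stabilizes.
RUN uv sync --extra libero_plus --no-cache
```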

@pkooij force-pushed the feat/async-vector-env branch 2 times, most recently from 35f18d4 to 566a77b on April 8, 2026 17:05
@pkooij changed the base branch from feat/async-vector-env to feat/benchmark-ci on April 9, 2026 08:04
@pkooij force-pushed the feat/libero-plus-benchmark branch 4 times, most recently from 7199137 to 9fe9008 on April 13, 2026 15:49
Base automatically changed from feat/benchmark-ci to main on April 13, 2026 19:24
@pkooij force-pushed the feat/libero-plus-benchmark branch from 1c82840 to 5df5008 on April 14, 2026 08:47
@github-actions bot added the documentation, tests, CI, evaluation, and github_actions labels on Apr 14, 2026
@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@pkooij force-pushed the feat/libero-plus-benchmark branch from 5df5008 to 0fa55f5 on April 14, 2026 09:09
@github-actions bot removed the tests label on Apr 14, 2026
@pkooij force-pushed the feat/libero-plus-benchmark branch 2 times, most recently from 6562146 to 542fb80 on April 14, 2026 11:17
- LiberoPlusEnv config (subclass of LiberoEnv, same gym interface)
- Docker image installing LIBERO-plus fork via PYTHONPATH
- CI workflow: 1-episode smoke eval with pepijn223/smolvla_libero_plus
- pyproject.toml: libero_plus extra

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@pkooij force-pushed the feat/libero-plus-benchmark branch from 842a948 to dab97dc on April 14, 2026 12:02