A ROS1 package for perception-driven intent recognition and workspace layout optimization to support human-robot collaboration with a Sawyer robot.
- Operating Systems: Ubuntu 20.04 LTS (tested)
- Software Dependencies:
- Programming language: Python 3.8+
- ROS: ROS Noetic (catkin workspace)
- Major ROS packages: rospy, tf, tf2_ros, cv_bridge, message_filters, vision_msgs, sensor_msgs, geometry_msgs, visualization_msgs
- Other ROS stacks used at runtime:
- realsense2_camera (Intel RealSense depth camera)
- relaxed_ik_ros1 (inverse kinematics teleop/planning)
- intera_interface (Sawyer robot SDK)
 
 
- Python libraries (installed via requirements.txt): numpy, opencv-python, mediapipe (optional), ultralytics (YOLO), matplotlib, pandas, shapely, tqdm, torch, pyribs, rospkg, catkin_pkg
- Hardware:
- Rethink Robotics Sawyer with electric gripper
- Intel RealSense depth camera (e.g., D435 series)
- NVIDIA GPU recommended for YOLO inference
 
- Set up ROS Noetic and a catkin workspace
```bash
sudo apt update

# Install ROS Noetic (see ROS docs if not installed)
# sudo apt install ros-noetic-desktop-full

# ROS dependencies used by this package
sudo apt install -y \
  ros-noetic-vision-msgs \
  ros-noetic-cv-bridge \
  ros-noetic-tf \
  ros-noetic-tf2-ros \
  ros-noetic-message-filters \
  ros-noetic-joy

# Optional dependencies from other stacks used in launches
sudo apt install -y ros-noetic-realsense2-camera
```

- Clone this package into your catkin workspace and build
```bash
cd /catkin_ws
# If not already present, clone dependent stacks (examples):
# git clone https://github.com/IntelRealSense/realsense-ros src/realsense-ros
# git clone https://github.com/uwgraphics/relaxed_ik_ros1 src/relaxed_ik_ros1
# intera SDK should be installed per Sawyer docs
catkin_make
source devel/setup.bash
```

- Create and activate a Python virtual environment, then install Python deps

```bash
python3 -m venv .venv
source .venv/bin/activate
pip install --upgrade pip
pip install -r src/tabletop_workspace_opt/requirements.txt
```

Notes:
- ROS Python packages (e.g., rospy, cv_bridge, tf) are provided by apt/ROS, not pip.
- For GPU acceleration, install the correct CUDA/cuDNN prior to installing torch.
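For example, a CUDA-enabled torch build can usually be installed from the PyTorch wheel index before the remaining requirements; the cu121 index below is only an example, so pick the index matching your CUDA version (see pytorch.org):

```bash
# Example only: choose the wheel index that matches your installed CUDA version.
pip install torch --index-url https://download.pytorch.org/whl/cu121
pip install -r src/tabletop_workspace_opt/requirements.txt
```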
- Getting Tea and Snacks
Please see assets/baseline_tea_task.mov and assets/workspace_optimized_teaa_tas.mov.
- Data Tools: Includes simple tools for manual data collection and labeling.
- Intent recognition and perception pipeline (camera + YOLO + intent):
```bash
source /catkin_ws/devel/setup.bash
roslaunch tabletop_workspace_opt intent_recognizer.launch
```

- Expected Output:
- RealSense streams are started and aligned; YOLO detections are shown/republished.
- RViz displays 3D bounding boxes and markers for tracked and detected objects.
- The intent_inference node publishes ~distribution and ~top_goal reflecting user intent.
- If show_gui:=true, OpenCV windows display annotated frames and GUI controls.
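To check the intent outputs from the command line, you can echo the topics. The resolved names below assume the node is named intent_inference; verify the actual names in your setup with rostopic list.

```bash
# Assumes the intent node is named "intent_inference"; verify with rostopic list.
rostopic echo /intent_inference/distribution
rostopic echo /intent_inference/top_goal
```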
 
If you do not have hardware available, record or play back a ROS bag with the following topics: /right_cam/color/image_raw, /right_cam/aligned_depth_to_color/image_raw, /right_cam/color/camera_info, and optional /robot/limb/right/endpoint_state.
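For example, a suitable bag could be recorded from a live system and replayed later; the bag file name below is just a placeholder.

```bash
# Record the camera topics (and, optionally, the end-effector state) from a live system
rosbag record -O perception_demo.bag \
  /right_cam/color/image_raw \
  /right_cam/aligned_depth_to_color/image_raw \
  /right_cam/color/camera_info \
  /robot/limb/right/endpoint_state

# Replay it later in place of the live hardware
rosbag play perception_demo.bag
```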
- Workspace optimization
```bash
python3 map_elites.py --config config/tea_task.yaml
```
The configuration file specifies the task graph, the objects in the scene, the color used to plot each object, and each object's bounding-box size. Use movable_object_mask to mark objects whose positions the algorithm should not modify; for those objects, you must also provide a position in object_positions so they can be included in collision checks (see the sketch after the expected output).
- Expected Output:
- Object positions [x, y, theta] printed to the console for all objects
- An image file (wo_layout_cma-me.png) showing the objects in a 2D top-down view
- An image file (wo_archive_heatmap_cma-me.png) showing the MAP-Elites archive
- A pickle file of the MAP-Elites archive
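As an illustration, a config might look roughly like the sketch below. Only movable_object_mask and object_positions are named in this README, so every other key name and the overall schema are assumptions; consult config/tea_task.yaml for the real format.

```yaml
# Hypothetical sketch of a task config; keys other than
# movable_object_mask and object_positions are assumed, not confirmed.
objects: [kettle, cup, tea_box]
object_sizes:              # bounding-box size per object (assumed key name)
  kettle: [0.20, 0.15]
  cup: [0.08, 0.08]
  tea_box: [0.10, 0.06]
object_colors:             # plot color per object (assumed key name)
  kettle: "tab:orange"
  cup: "tab:blue"
  tea_box: "tab:green"
task_graph:                # task ordering over objects (assumed key name)
  - [kettle, cup]
  - [tea_box, cup]
movable_object_mask: [false, true, true]   # false = position is fixed and not optimized
object_positions:          # required for fixed objects, used in collision checks
  kettle: [0.45, 0.10, 0.0]
```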
 
- Configuration parameters are exposed as ROS params in the launch files and nodes:
- Perception (yolo_3d_pose_node.py): model path (~model), thresholds, class filters, frame names, GUI toggle.
- Intent (intent_inference_node.py): ~tracker_type (hand or end_effector), ~base_frame, temporal window, beta, and speed threshold.
- IK/teleop via relaxed_ik_ros1 and joystick/keyboard teleop are included in the launches.
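As an example, private node parameters can be overridden when a node is run directly with rosrun; the values below are placeholders, and the launch files may already expose equivalent arguments.

```bash
# Placeholder values; the launch files may expose these as roslaunch arguments instead.
rosrun tabletop_workspace_opt intent_inference_node.py \
  _tracker_type:=end_effector \
  _base_frame:=base
```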
 
- To adapt for your robot or camera:
- Update frame names (~base_frame, ~color_optical_frame) and static transforms in the launch files.
- Replace YOLO weights (e.g., yolov8m.pt or yolo11m.pt) via the ~model param.
- Adjust detection classes and patch sizes via node params.
- For end-effector intent, ensure /robot/limb/right/endpoint_state exists or remap the topic (see the example below).
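A minimal sketch of remapping that topic when running the node directly; the target topic is a placeholder for whatever your robot actually publishes.

```bash
# Placeholder target topic; substitute your robot's end-effector state topic.
rosrun tabletop_workspace_opt intent_inference_node.py \
  /robot/limb/right/endpoint_state:=/my_robot/endpoint_state
```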
 
- The optimization component (src/envopt/map_elites.py) can be run standalone for workspace layout studies; install the Python dependencies from requirements.txt, then run the script directly with Python to generate the archive and visualizations (see the example below).
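For instance, assuming the virtual environment from the install step and the repository layout described above (the exact path is an assumption; adjust it to your checkout):

```bash
source .venv/bin/activate
# Path is an assumption based on the layout above; adjust to your checkout.
cd src/tabletop_workspace_opt/src/envopt
python3 map_elites.py --config config/tea_task.yaml
```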
If you use this code in academic work, please cite the accompanying paper ().
