Commit 1ac6246

Update rai_perception README.md and associated files
1 parent cfe37b1 commit 1ac6246

6 files changed: +760 −466

poetry.lock

Lines changed: 488 additions & 389 deletions
Some generated files are not rendered by default.

pyproject.toml

Lines changed: 1 addition & 4 deletions
```diff
@@ -50,10 +50,7 @@ rai_bench = {path = "src/rai_bench", develop = true}
 optional = true

 [tool.poetry.group.perception.dependencies]
-torch = "^2.3.1"
-torchvision = "^0.18.1"
-rf-groundingdino = "^0.2.0"
-sam2 = { git = "https://github.com/RobotecAI/Grounded-SAM-2", branch = "main" }
+rai_perception = {path = "src/rai_extensions/rai_perception", develop = true}

 [tool.poetry.group.nomad]
 optional = true
```
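This hunk swaps the inlined model dependencies for a path dependency on the new `rai_perception` package. A minimal sketch of how the optional group would still be installed from the repo root, assuming the group name `perception` is unchanged (as the hunk shows):

```bash
# From the RAI repo root: the optional perception group now pulls in
# rai_perception (and its torch/sam2 pins) via the path dependency.
poetry install --with perception
```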

src/rai_extensions/rai_perception/README.md

Lines changed: 120 additions & 42 deletions
````diff
@@ -2,56 +2,108 @@

 # RAI Perception

-This package provides a ROS 2 node that interfaces with the [Idea-Research GroundingDINO Model](https://github.com/IDEA-Research/GroundingDINO) for open-set object detection.
+This package provides ROS2 integration with the [Idea-Research GroundingDINO Model](https://github.com/IDEA-Research/GroundingDINO) and the [RobotecAI fork of Grounded-SAM-2](https://github.com/RobotecAI/Grounded-SAM-2) for object detection, segmentation, and gripping point calculation. The `GroundedSamAgent` and `GroundingDinoAgent` are ROS2 service nodes that can be readily added to ROS2 applications. The package also provides tools that can be used with [RAI LLM agents](../../../docs/tutorials/walkthrough.md) to construct conversational scenarios.

+In addition to these building blocks, this package includes utilities to facilitate development, such as a ROS2 client that demonstrates interactions with the agent nodes.

 ## Installation

-In your workspace you need to have an `src` folder containing this package `rai_perception` and the `rai_interfaces` package.
+Installation of `rai_perception` via pip is actively being worked on; in the meantime, to incorporate it into your application you will need to set up a ROS2 workspace.

-### Preparing the GroundingDINO
+### ROS2 Workspace Setup

-Add required ROS dependencies:
+Create a ROS2 workspace and copy this package into it:

-```
-rosdep install --from-paths src --ignore-src -r
+```bash
+mkdir -p ~/rai_perception_ws/src
+cd ~/rai_perception_ws/src
+
+# Check out only the rai_perception package
+# TODO:juliaj, update branch to main!
+git clone --depth 1 --branch jj/feat/rai-perception-pkg https://github.com/RobotecAI/rai.git temp
+cd temp
+git archive --format=tar --prefix=rai_perception/ HEAD:src/rai_extensions/rai_perception | tar -xf -
+mv rai_perception ../rai_perception
+cd ..
+rm -rf temp
 ```

-## Build and run
+### ROS2 Dependencies

-In the base directory of the `RAI` package install dependencies:
+Add the required ROS dependencies. From the workspace root, run:

-```
-poetry install --with perception
+```bash
+rosdep install --from-paths src --ignore-src -r
 ```

-Source the ros installation
+### Build and Run

-```
-source /opt/ros/${ROS_DISTRO}/setup.bash
-```
+Source ROS2 and build:

-Run the build process:
+```bash
+# Source ROS2 (humble or jazzy)
+source /opt/ros/${ROS_DISTRO}/setup.bash

-```
+# Build the workspace
+cd ~/rai_perception_ws
 colcon build --symlink-install
+
+# Source the built packages
+source install/setup.bash
 ```

-Source the environment
+### Python Dependencies

-```
-source setup_shell.sh
+`rai_perception` depends on `rai-core` and `sam2`. There are many ways to set up a virtual environment and install these dependencies; below we provide an example using Poetry.
+
+**Step 1:** Copy the following template to `pyproject.toml` in your workspace root, updating it according to your directory setup:
+
+```toml
+# rai_perception_project pyproject template
+[tool.poetry]
+name = "rai_perception_ws"
+version = "0.1.0"
+description = "ROS2 workspace for RAI perception"
+package-mode = false
+
+[tool.poetry.dependencies]
+python = "^3.10, <3.13"
+rai-core = ">=2.5.4"
+rai-perception = {path = "src/rai_perception", develop = true}
+
+[build-system]
+requires = ["poetry-core>=1.0.0"]
+build-backend = "poetry.core.masonry.api"
 ```

-Run the `GroundedSamAgent` and `GroundingDinoAgent` agents.
+**Step 2:** Install dependencies.

+First, create the virtual environment with Poetry:
+
+```bash
+cd ~/rai_perception_ws
+poetry lock
+poetry install
 ```
-python run_vision_agents.py
+
+Now we are ready to launch the perception agents:
+
+```bash
+# Activate the virtual environment
+source "$(poetry env info --path)"/bin/activate
+export PYTHONPATH
+PYTHONPATH="$(dirname "$(dirname "$(poetry run which python)")")/lib/python$(poetry run python --version | awk '{print $2}' | cut -d. -f1,2)/site-packages:$PYTHONPATH"
+
+# Run the agents
+python src/rai_perception/scripts/run_perception_agents.py
 ```

+> [!TIP]
+> To manage a ROS 2 + Poetry environment with less friction, keep build tools (colcon) at the system level and use Poetry only for the runtime dependencies of your packages.
+
 <!--- --8<-- [end:sec1] -->

-Agents create two ROS 2 Nodes: `grounding_dino` and `grounded_sam` using [ROS2Connector](../../../docs/API_documentation/connectors/ROS_2_Connectors.md).
+The `rai_perception` agents create two ROS 2 nodes, `grounding_dino` and `grounded_sam`, using [ROS2Connector](../../../docs/API_documentation/connectors/ROS_2_Connectors.md).
 These agents can be triggered by ROS2 services:

 - `grounding_dino_classify`: `rai_interfaces/srv/RAIGroundingDino`
````
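The service type above ships with `rai_interfaces`, so as a quick sanity check (a sketch assuming the workspace is built and sourced), the schema and the running agents can be inspected with standard ROS 2 tooling:

```bash
# Print the request/response fields of the detection service type
ros2 interface show rai_interfaces/srv/RAIGroundingDino

# With the agents running, both nodes should be listed
ros2 node list   # expect /grounding_dino and /grounded_sam

# and the classify service should be advertised
ros2 service list | grep grounding_dino_classify
```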
````diff
@@ -68,83 +120,109 @@ These agents can be triggered by ROS2 services:
 ## RAI Tools

 `rai_perception` package contains tools that can be used by [RAI LLM agents](../../../docs/tutorials/walkthrough.md)
-enhance their perception capabilities. For more information on RAI Tools see
+to enhance their perception capabilities. For more information on RAI Tools see
 [Tool use and development](../../../docs/tutorials/tools.md) tutorial.

-<!--- --8<-- [start:sec3] -->
+<!--- --8<-- [start:sec2] -->

 ### `GetDetectionTool`

-This tool calls the grounding dino service to use the model to see if the message from the provided camera topic contains objects from a comma separated prompt.
+This tool calls the GroundingDINO service to detect objects from a comma-separated prompt in the provided camera topic.

-<!--- --8<-- [end:sec3] -->
+<!--- --8<-- [end:sec2] -->

 > [!TIP]
 >
 > you can try example below with [rosbotxl demo](../../../docs/demos/rosbot_xl.md) binary.
-> The binary exposes `/camera/camera/color/image_raw` and `/camera/camera/depth/image_raw` topics.
+> The binary exposes `/camera/camera/color/image_raw` and `/camera/camera/depth/image_rect_raw` topics.

-<!--- --8<-- [start:sec4] -->
+<!--- --8<-- [start:sec3] -->

 **Example call**

 ```python
+import time
 from rai_perception.tools import GetDetectionTool
 from rai.communication.ros2 import ROS2Connector, ROS2Context

 with ROS2Context():
     connector=ROS2Connector(node_name="test_node")
+
+    # Wait for topic discovery to complete
+    print("Waiting for topic discovery...")
+    time.sleep(3)
+
     x = GetDetectionTool(connector=connector)._run(
         camera_topic="/camera/camera/color/image_raw",
-        object_names=["chair", "human", "plushie", "box", "ball"],
+        object_names=["bed", "bed pillow", "table lamp", "plant", "desk"],
     )
+    print(x)
 ```

 **Example output**

 ```
-I have detected the following items in the picture - chair, human
+I have detected the following items in the picture plant, table lamp, table lamp, bed, desk
 ```

 ### `GetDistanceToObjectsTool`

-This tool calls the grounding dino service to use the model to see if the message from the provided camera topic contains objects from a comma separated prompt. Then it utilises messages from depth camera to create an estimation of distance to a detected object.
+This tool calls the GroundingDINO service to detect objects from a comma-separated prompt in the provided camera topic. Then it utilizes messages from the depth camera to estimate the distance to detected objects.

 **Example call**

 ```python
-from rai_perception.tools import GetDetectionTool
+from rai_perception.tools import GetDistanceToObjectsTool
 from rai.communication.ros2 import ROS2Connector, ROS2Context
+import time

 with ROS2Context():
     connector=ROS2Connector(node_name="test_node")
-    connector.node.declare_parameter("conversion_ratio", 1.0) # scale parameter for the depth map
+    connector.node.declare_parameter("conversion_ratio", 1.0)  # scale parameter for the depth map
+
+    # Wait for topic discovery to complete
+    print("Waiting for topic discovery...")
+    time.sleep(3)
+
     x = GetDistanceToObjectsTool(connector=connector)._run(
         camera_topic="/camera/camera/color/image_raw",
         depth_topic="/camera/camera/depth/image_rect_raw",
-        object_names=["chair", "human", "plushie", "box", "ball"],
+        object_names=["desk"],
     )

+    print(x)
 ```

 **Example output**

 ```
-I have detected the following items in the picture human: 3.77m away
+I have detected the following items in the picture desk: 2.43m away
 ```

 ## Simple ROS2 Client Node Example

-An example client is provided with the package as `rai_perception/talker.py`
+The `rai_perception/talker.py` example demonstrates how to use the perception services for object detection and segmentation. It shows the complete pipeline: GroundingDINO for object detection followed by GroundedSAM for instance segmentation, with visualization output.
+
+This example is useful for:

-You can see it working by running:
+- Testing perception services integration
+- Understanding the ROS2 service call patterns
+- Seeing detection and segmentation results with bounding boxes and masks

+Run the example:
+
+```bash
+cd ~/rai_perception_ws
+python src/rai_perception/scripts/run_perception_agents.py
 ```
-python run_vision_agents.py
-cd rai # rai repo BASE directory
-ros2 run rai_perception talker --ros-args -p image_path:=src/rai_extensions/rai_perception/images/sample.jpg
+
+In a different window, run:
+
+```bash
+cd ~/rai_perception_ws
+ros2 run rai_perception talker --ros-args -p image_path:=src/rai_perception/images/sample.jpg
 ```

-If everything was set up properly you should see a couple of detections with classes `dinosaur`, `dragon`, and `lizard`.
+The example will detect objects (dragon, lizard, dinosaur) and save a visualization with bounding boxes and masks to `masks.png`.

-<!--- --8<-- [end:sec4] -->
+<!--- --8<-- [end:sec3] -->
````
src/rai_extensions/rai_perception/pyproject.toml (new file)

Lines changed: 18 additions & 0 deletions

```diff
@@ -0,0 +1,18 @@
+[tool.poetry]
+name = "rai_perception"
+version = "0.1.0"
+description = "Package enabling perception capabilities for RAI"
+authors = ["Kajetan Rachwał <[email protected]>"]
+license = "Apache License 2.0"
+readme = "README.md"
+
+[tool.poetry.dependencies]
+# TODO:(juliaj) update sam2 dependency after https://github.com/RobotecAI/Grounded-SAM-2/pull/3 is merged
+torch = "^2.3.1"
+torchvision = "^0.18.1"
+rf-groundingdino = "^0.2.0"
+sam2 = { git = "https://github.com/RobotecAI/Grounded-SAM-2", branch = "main" }
+
+[build-system]
+requires = ["poetry-core>=1.0.0"]
+build-backend = "poetry.core.masonry.api"
```
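Since this file defines a standalone Poetry project, its pinned dependency set can also be installed in isolation. A minimal sketch, assuming a full checkout of the repository; `--no-root` skips installing the package itself, which is built with colcon:

```bash
# From the RAI repo root: install only rai_perception's pinned dependencies
# (torch, torchvision, rf-groundingdino, sam2) into the active Poetry env.
cd src/rai_extensions/rai_perception
poetry install --no-root
```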
