Should SceneEntityCfg ID collections become on-device tensors? #1019

Currently, when we resolve a `SceneEntityCfg` with a set of body, joint, or site names, we end up with `body_ids`, `joint_ids`, etc. as Python `list[int]` values. We typically then do something like:

```py
# Grab the positions of the joints I care about
entity.data.joint_pos[:, entity_cfg.joint_ids]
```

Because `entity_cfg.joint_ids` is a Python list, this indexing necessarily hits an implicit CUDA synchronization: https://docs.nvidia.com/dl-cuda-graph/torch-cuda-graph/sync-free-code.html#indexing-tensors (specifically the `x_gpu[idx_list]  # cuStreamSynchronize, implicit blocking .to()` case). Every time we hit that line, the `joint_ids` list has to be copied into a brand-new tensor on the GPU, which costs an unnecessary allocation and a synchronization step.
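To make the difference concrete, here is a minimal sketch in plain PyTorch. The `joint_pos` tensor, its shape, and the index values are made-up stand-ins for `entity.data.joint_pos` and `entity_cfg.joint_ids`, not taken from the library:

```python
import torch

# Fall back to CPU so the sketch runs anywhere; the sync only matters on CUDA.
device = "cuda" if torch.cuda.is_available() else "cpu"

# Stand-in for entity.data.joint_pos with shape (num_envs, num_joints).
joint_pos = torch.arange(4 * 6, dtype=torch.float32, device=device).reshape(4, 6)

# What we get today after resolving: a plain Python list of indices.
joint_ids_list = [0, 2, 5]

# Indexing with the list builds a fresh index tensor from host data on every
# call (the `x_gpu[idx_list]` case on the linked NVIDIA page).
from_list = joint_pos[:, joint_ids_list]

# Building the index tensor once, on the same device, and reusing it avoids
# the per-call host-to-device copy.
joint_ids_tensor = torch.tensor(joint_ids_list, dtype=torch.long, device=device)
from_tensor = joint_pos[:, joint_ids_tensor]

# Both forms select the same columns.
assert torch.equal(from_list, from_tensor)
```

Both forms return the same values, so the change would only affect where the index data lives, not what the indexing computes.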

We've been working around this by ditching the functional reward/observation format and caching `self._joint_ids: Tensor` in a reward/observation class. But it would be pretty nice not to have to do that. 
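For reference, the workaround looks roughly like the sketch below. `JointPosObservation` and its constructor arguments are hypothetical names for illustration only; our real terms take more context than this:

```python
import torch


class JointPosObservation:
    """Hypothetical class-based observation term that caches its joint indices."""

    def __init__(self, joint_ids: list, device: str):
        # One-time host-to-device copy when the term is constructed.
        self._joint_ids = torch.tensor(joint_ids, dtype=torch.long, device=device)

    def __call__(self, joint_pos: torch.Tensor) -> torch.Tensor:
        # Per-step indexing reuses the cached on-device index tensor.
        return joint_pos[:, self._joint_ids]


device = "cuda" if torch.cuda.is_available() else "cpu"
term = JointPosObservation([1, 3], device)
joint_pos = torch.zeros(2, 4, device=device)  # stand-in for entity.data.joint_pos
obs = term(joint_pos)  # shape (num_envs, len(joint_ids))
```

This works, but it means every term that indexes by ID needs its own class and its own cached tensor, which is the boilerplate we would like to avoid.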

Is there some performance trap that we're avoiding by using a plain `list[int]` here? Or could we switch this to tensors and get rid of these implicit CUDA syncs in common cases? 

cc @bd-pdomanico who pointed this out originally. 