
[Feature]: Enable Zero-Copy Memory Sharing between Taichi and PyTorch #246

@pjw971022

Description


What feature or enhancement are you proposing?

Implement DLPack protocol support in Taichi to enable zero-copy memory sharing between Taichi ndarrays and PyTorch tensors. This would allow seamless, efficient data exchange for Genesis simulations integrated with deep learning pipelines.

Specifically requesting (a usage sketch follows the list):

  1. DLPack protocol implementation: __dlpack__() and __dlpack_device__() methods for Taichi ndarrays
  2. Helper functions: ti.from_dlpack() to import PyTorch/JAX/CuPy arrays without copying
  3. Alternative: __cuda_array_interface__ protocol support as a fallback
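
To make the request concrete, here is a sketch of what user code could look like once the protocol lands. Nothing on the Taichi side of this snippet exists yet: ti.from_dlpack() and DLPack export from ti.ndarray are the proposed API, not current behavior.

```python
import taichi as ti
import torch

ti.init(arch=ti.cuda)

# Proposed (not yet implemented): export a Taichi ndarray via DLPack.
x = ti.ndarray(dtype=ti.f32, shape=(1024,))
t = torch.from_dlpack(x)   # would invoke x.__dlpack__() / x.__dlpack_device__()
t += 1.0                   # mutation visible on both sides: one shared buffer

# Proposed (not yet implemented): import a PyTorch tensor without copying.
y = ti.from_dlpack(torch.zeros(1024, device="cuda"))
```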

Motivation

Currently, Genesis simulations using Taichi require deep copying when exchanging data with PyTorch:

  • to_torch() and from_torch() create full data copies (see the sketch after this list)
  • This causes severe bottlenecks in RL training loops where observations/actions are exchanged every step
  • Memory usage is doubled unnecessarily
  • GPU utilization is suboptimal due to unnecessary synchronization
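
For reference, the copy-based round trip looks roughly like this today (the shapes and the stand-in policy network are illustrative; to_torch() and from_torch() are the existing Taichi APIs named above):

```python
import taichi as ti
import torch

ti.init(arch=ti.cuda)

obs = ti.ndarray(dtype=ti.f32, shape=(4096, 64))
policy = torch.nn.Linear(64, 64).cuda()      # stand-in for an RL policy

for step in range(1000):
    obs_torch = obs.to_torch(device="cuda")  # full copy out of Taichi every step
    action = policy(obs_torch)
    obs.from_torch(action.detach())          # full copy back into Taichi
```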

Competitive landscape: other physics simulators already support zero-copy tensor exchange with ML frameworks.

Without zero-copy support, Genesis/Taichi faces a significant performance disadvantage for ML workloads.

Potential Benefit

Zero-copy data transfer for large tensors (e.g., link/dof state, point clouds, contact forces).

What is the expected outcome of the implementation work?

The implementation should enable seamless zero-copy memory sharing between Taichi and PyTorch, following the industry-standard DLPack protocol.

Expected outcomes:

  1. DLPack Protocol Support: Taichi ndarrays should be able to export and import tensors using the DLPack standard, enabling interoperability with PyTorch, JAX, CuPy, and other compliant frameworks without data copying.

  2. Bidirectional Zero-Copy: Users should be able to convert Taichi ndarrays to PyTorch tensors and vice versa without memory duplication, similar to how PyTorch and CuPy currently interoperate (demonstrated in the snippet after this list).

  3. Transparent Integration: The zero-copy mechanism should work seamlessly with existing Genesis workflows, requiring minimal code changes while providing significant performance improvements in simulation-learning pipelines.

  4. Device Compatibility: The solution should support both CPU and CUDA backends, with proper handling of device placement and memory synchronization to prevent race conditions.
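
Outcome 2's reference point, PyTorch/CuPy interop, is runnable today and shows the exact semantics Taichi would need to match (the snippet assumes a CUDA machine with both libraries installed):

```python
import cupy
import torch

a = cupy.arange(8, dtype=cupy.float32)

# Zero-copy: torch.from_dlpack() consumes a.__dlpack__() and wraps the
# same CUDA allocation instead of copying it.
t = torch.from_dlpack(a)
t[0] = 42.0
print(float(a[0]))            # 42.0 -- both views share one buffer

# __dlpack_device__() reports (device_type, device_id); in the DLPack
# spec kDLCUDA == 2, so a tensor on GPU 0 reports (2, 0).
print(t.__dlpack_device__())  # e.g. (2, 0)
```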

The ultimate goal is to eliminate data transfer bottlenecks in Genesis+Taichi workflows, making them competitive with or superior to other physics simulators that already support efficient tensor sharing.
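
For the __cuda_array_interface__ fallback (request 3 in the list at the top), the export side can be done in pure Python once a raw device pointer is available. The sketch below is illustrative only: CudaArrayView and the device_ptr argument are hypothetical names, and how Taichi exposes that pointer is exactly the open question.

```python
class CudaArrayView:
    """Hypothetical adapter: exposes a raw CUDA pointer through the
    Numba __cuda_array_interface__ (v3) so CuPy/PyTorch can wrap it."""

    def __init__(self, device_ptr: int, shape: tuple, typestr: str = "<f4"):
        self._ptr = device_ptr      # assumed to come from Taichi's pointer access
        self._shape = tuple(shape)
        self._typestr = typestr     # "<f4" = little-endian float32

    @property
    def __cuda_array_interface__(self):
        return {
            "shape": self._shape,
            "typestr": self._typestr,
            "data": (self._ptr, False),  # (pointer, read_only flag)
            "strides": None,             # None means C-contiguous
            "version": 3,
        }

# A consumer could then wrap the buffer without copying, e.g.:
#   t = torch.as_tensor(CudaArrayView(ptr, (1024,)), device="cuda")
```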

Additional information


Technical Foundation

  • Taichi already has physical memory pointer access (code ref)
  • ti.types.ndarray() can already accept PyTorch tensors in kernels (read-only; see the example below)
  • PyTorch, CuPy, JAX all have mature DLPack implementations as reference
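
The second bullet is usable today; the example below passes a PyTorch tensor straight into a kernel typed with ti.types.ndarray(). In line with the read-only note, the kernel only reads the tensor and writes its result to a Taichi ndarray.

```python
import taichi as ti
import torch

ti.init(arch=ti.cuda)

@ti.kernel
def scale(src: ti.types.ndarray(), dst: ti.types.ndarray()):
    for i in range(src.shape[0]):
        dst[i] = 2.0 * src[i]   # read the torch tensor, write the Taichi ndarray

x = torch.arange(8, dtype=torch.float32, device="cuda")
out = ti.ndarray(dtype=ti.f32, shape=8)
scale(x, out)
print(out.to_numpy())           # [ 0.  2.  4.  6.  8. 10. 12. 14.]
```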


Note: If this feature already exists or there are specific guidelines for achieving zero-copy between Taichi and PyTorch that I'm not aware of, please let me know. I've searched through the documentation and issues but couldn't find a working solution. Any guidance would be greatly appreciated. Thank you!

We are willing to contribute to implementation and testing, particularly for Genesis integration validation.
