Description
What feature or enhancement are you proposing?
Implement DLPack protocol support in Taichi to enable zero-copy memory sharing between Taichi ndarrays and PyTorch tensors. This would allow seamless, efficient data exchange for Genesis simulations integrated with deep learning pipelines.
Specifically requesting (a usage sketch follows this list):
- DLPack protocol implementation: `__dlpack__()` and `__dlpack_device__()` methods for Taichi ndarrays
- Helper functions: `ti.from_dlpack()` to import PyTorch/JAX/CuPy arrays without copying
- Alternative: `__cuda_array_interface__` protocol support as a fallback
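As a rough sketch of the intended usage: the Taichi-side pieces (`__dlpack__` on ndarrays and `ti.from_dlpack()`) are the proposed, hypothetical additions, while `torch.from_dlpack()` is PyTorch's existing DLPack entry point.

```python
import taichi as ti
import torch

ti.init(arch=ti.cuda)

# Hypothetical: a Taichi ndarray that implements __dlpack__ /
# __dlpack_device__, making it a standard DLPack exporter.
state = ti.ndarray(dtype=ti.f32, shape=(1024, 3))

# Export: torch.from_dlpack() accepts any DLPack exporter, so no
# PyTorch-side changes are needed; the result is a view, not a copy.
t = torch.from_dlpack(state)

# Import (proposed helper): wrap an external tensor as a Taichi ndarray
# backed by the same device memory.
x = ti.from_dlpack(torch.zeros(1024, 3, device="cuda"))
```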
Motivation
Currently, Genesis simulations using Taichi require deep copying when exchanging data with PyTorch (contrasted in the sketch after this list):
- `to_torch()` and `from_torch()` create full data copies
- This causes severe bottlenecks in RL training loops where observations/actions are exchanged every step
- Memory usage is doubled unnecessarily
- GPU utilization is suboptimal due to unnecessary synchronization
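To make the cost concrete, here is a hedged sketch of one training step today versus with the proposed protocol; the `policy` network is an illustrative stand-in, and the zero-copy lines are commented out because `ti.from_dlpack()` does not exist yet.

```python
import taichi as ti
import torch

ti.init(arch=ti.cuda)
sim_state = ti.ndarray(dtype=ti.f32, shape=(4096, 64))
policy = torch.nn.Linear(64, 12).cuda()  # stand-in for a real policy network

# Today: the simulation state is fully copied to PyTorch every step
# (and actions are copied back with from_torch()).
obs = sim_state.to_torch()               # copy: Taichi -> PyTorch
action = policy(obs)                     # learning-side compute

# With DLPack support (proposed), both transfers become views:
# obs = torch.from_dlpack(sim_state)     # zero-copy, same device buffer
# act = ti.from_dlpack(action.detach())  # zero-copy back into Taichi
```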
Competitive landscape: Other physics simulators already support zero-copy:
- MJLab (MuJoCo) appears to support efficient zero-copy tensor sharing: https://github.com/mujocolab/mjlab
Without zero-copy support, Genesis/Taichi faces a significant performance disadvantage for ML workloads.
Potential Benefit
- Zero-copy data transfer for large tensors (e.g., link/dof state, point clouds, contact forces)
What is the expected outcome of the implementation work?
The implementation should enable seamless zero-copy memory sharing between Taichi and PyTorch, following the industry-standard DLPack protocol.
Expected outcomes:
- DLPack Protocol Support: Taichi ndarrays should be able to export and import tensors using the DLPack standard, enabling interoperability with PyTorch, JAX, CuPy, and other compliant frameworks without data copying.
- Bidirectional Zero-Copy: Users should be able to convert Taichi ndarrays to PyTorch tensors and vice versa without memory duplication, similar to how PyTorch and CuPy currently interoperate.
- Transparent Integration: The zero-copy mechanism should work seamlessly with existing Genesis workflows, requiring minimal code changes while providing significant performance improvements in simulation-learning pipelines.
- Device Compatibility: The solution should support both CPU and CUDA backends, with proper handling of device placement and memory synchronization to prevent race conditions (see the exporter sketch after this list).
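On the device-compatibility point, a minimal sketch of what the exporter side might look like: the device codes come from the DLPack specification, while every `_`-prefixed attribute and helper on the Taichi side is an assumption about internals, not existing API.

```python
from enum import IntEnum

class DLDeviceType(IntEnum):
    # Device type codes defined by the DLPack specification.
    kDLCPU = 1
    kDLCUDA = 2

class NdarrayDLPackMixin:
    """Hypothetical mixin adding the DLPack protocol to Taichi ndarrays."""

    def __dlpack_device__(self):
        # The protocol requires a (device_type, device_id) pair.
        if self._arch == "cuda":                 # assumed internal attribute
            return (int(DLDeviceType.kDLCUDA), self._device_id)
        return (int(DLDeviceType.kDLCPU), 0)

    def __dlpack__(self, stream=None):
        # Per the spec, the producer must order pending work against the
        # consumer's `stream` before exporting, then return a PyCapsule
        # named "dltensor" that wraps a DLManagedTensor.
        self._order_against_stream(stream)       # assumed helper
        return self._make_dlpack_capsule()       # assumed helper
```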
The ultimate goal is to eliminate data transfer bottlenecks in Genesis+Taichi workflows, making them competitive with or superior to other physics simulators that already support efficient tensor sharing.
Additional information
Related Issues
- taichi-dev/taichi#5057, "Support common array interfaces in python for zero-copy data sharing": a long-standing request for zero-copy support, open since 2022
The discussion confirms technical feasibility with existing memory pointer access.
Technical Foundation
- Taichi already has physical memory pointer access (code ref), used in the fallback sketch below
- `ti.types.ndarray()` can already accept PyTorch tensors in kernels (read-only)
- PyTorch, CuPy, and JAX all have mature DLPack implementations to use as reference
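Building on that pointer access, the `__cuda_array_interface__` fallback requested above could be a thin property. This is a sketch under the assumption that a `_device_ptr()` accessor exists; the dict keys follow version 3 of the interface spec, and consumers such as `cupy.asarray()` and `torch.as_tensor()` can ingest objects exposing it.

```python
class NdarrayCudaArrayInterface:
    """Hypothetical fallback: expose __cuda_array_interface__ (v3) on
    Taichi ndarrays using the existing physical pointer access."""

    @property
    def __cuda_array_interface__(self):
        return {
            "shape": tuple(self.shape),
            "typestr": "<f4",                     # float32, little-endian
            "data": (self._device_ptr(), False),  # (pointer, read_only); assumed helper
            "version": 3,
        }
```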
Implementation References
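As a concrete reference, PyTorch and CuPy already achieve exactly this behavior through their public DLPack entry points (`torch.from_dlpack()` and `cupy.from_dlpack()`); a Taichi implementation could mirror it.

```python
import cupy as cp
import torch

a = torch.arange(6, device="cuda", dtype=torch.float32)

# CuPy consumes any DLPack exporter; no copy is made.
b = cp.from_dlpack(a)
b += 1                                  # mutates the shared device buffer
print(a)                                # reflects the in-place update

# And back again: same device memory, zero copies either way.
c = torch.from_dlpack(b)
assert c.data_ptr() == a.data_ptr()
```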
Note: If this feature already exists or there are specific guidelines for achieving zero-copy between Taichi and PyTorch that I'm not aware of, please let me know. I've searched through the documentation and issues but couldn't find a working solution. Any guidance would be greatly appreciated. Thank you!
We are willing to contribute to implementation and testing, particularly for Genesis integration validation.