
[QUESTION] self.model.enable_tri_collisions Flag #405

Open
ywu110 opened this issue Dec 21, 2024 · 2 comments
Labels
question (The issue author requires information), warp.sim (Issue with the simulation module)

Comments


ywu110 commented Dec 21, 2024

Issue Overview:

I am working on a cloth simulation using the example provided here. In addition to handling collisions between the cloth and the Bunny model, I aim to incorporate cloth self-collision into the simulation.

Steps Taken:

  1. Enabled Self-Collision: I added the following line to the example code to enable cloth self-collision:
self.model.enable_tri_collisions = True
  2. Observed Issues:
  • Persistent Penetration: Despite enabling self-collision, the simulation still exhibits significant penetration within the cloth itself.

  • Performance Degradation: Enabling self.model.enable_tri_collisions drastically increased the rendering time from approximately 2 ms to 800 ms. Using wp.ScopedTimer, I traced the slowdown to the following line in warp/sim/render.py:

particle_q = state.particle_q.numpy()

Link to the specific line

Questions:

  1. Self-Collision Effectiveness:
  • Is adding self.model.enable_tri_collisions = True sufficient to enable effective cloth self-collision?

  • If not, what additional steps or configurations are necessary to achieve reliable self-collision handling?

  2. Performance Impact:
  • Why does enabling self.model.enable_tri_collisions cause such a significant increase in rendering time, especially when the shape of particle_q remains unchanged?
  • Are there recommended optimizations or alternative approaches to mitigate this performance hit while maintaining accurate self-collision?

Additional Information:

Warp Version: Both V1.3.0 and V1.5.0
Hardware: NVIDIA RTX 3090, CUDA 12.2, Driver Version: 535.183.01
Operating System: Ubuntu 22.04
Python Version: Python 3.10.13

Thank you in advance!!

I am keen to resolve these issues to advance my simulation work. Any guidance or suggestions you can provide would be greatly appreciated.
Thank you once again for developing and maintaining such an amazing project!

ywu110 added the question label Dec 21, 2024
shi-eric (Contributor) commented Dec 30, 2024

I can't speak to the enable_tri_collisions flag, but I did want to comment on the performance analysis and caution about wp.ScopedTimer. Since the CUDA API is asynchronous, it can produce misleading results when combined with traditional timing approaches in Python. Our profiling docs go into this: https://nvidia.github.io/warp/profiling.html
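
For illustration, here is a minimal, self-contained sketch of the difference (the kernel is just a stand-in workload, not from the cloth example):

import warp as wp

@wp.kernel
def scale(a: wp.array(dtype=float), s: float):
    i = wp.tid()
    a[i] = a[i] * s

a = wp.zeros(1_000_000, dtype=float)

# Without synchronize=True the timer only measures the (asynchronous) launch;
# the GPU work itself gets billed to whatever synchronizes later.
with wp.ScopedTimer("launch only"):
    wp.launch(scale, dim=a.shape[0], inputs=[a, 2.0])

# With synchronize=True the timer waits for the GPU, so it reports the real cost.
with wp.ScopedTimer("launch + sync", synchronize=True):
    wp.launch(scale, dim=a.shape[0], inputs=[a, 2.0])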

In your specific case, the performance hit isn't coming from state.particle_q.numpy(). Instead, it seems that the eval_triangles_contact kernel is quite expensive:

@wp.kernel
def eval_triangles_contact(
    # idx : wp.array(dtype=int),  # list of indices for colliding particles
    num_particles: int,  # size of particles
    x: wp.array(dtype=wp.vec3),
    v: wp.array(dtype=wp.vec3),
    indices: wp.array2d(dtype=int),
    materials: wp.array2d(dtype=float),
    particle_radius: wp.array(dtype=float),
    f: wp.array(dtype=wp.vec3),
):
    tid = wp.tid()
    face_no = tid // num_particles  # which face
    particle_no = tid % num_particles  # which particle

    # k_mu = materials[face_no, 0]
    # k_lambda = materials[face_no, 1]
    # k_damp = materials[face_no, 2]
    # k_drag = materials[face_no, 3]
    # k_lift = materials[face_no, 4]

    # at the moment, just one particle
    pos = x[particle_no]

    i = indices[face_no, 0]
    j = indices[face_no, 1]
    k = indices[face_no, 2]

    if i == particle_no or j == particle_no or k == particle_no:
        return

    p = x[i]  # point zero
    q = x[j]  # point one
    r = x[k]  # point two

    # vp = v[i]  # vel zero
    # vq = v[j]  # vel one
    # vr = v[k]  # vel two

    # qp = q - p  # barycentric coordinates (centered at p)
    # rp = r - p

    bary = triangle_closest_point_barycentric(p, q, r, pos)
    closest = p * bary[0] + q * bary[1] + r * bary[2]

    diff = pos - closest
    dist = wp.dot(diff, diff)
    n = wp.normalize(diff)
    c = wp.min(dist - particle_radius[particle_no], 0.0)  # 0 unless within particle's contact radius
    # c = wp.leaky_min(dot(n, x0) - 0.01, 0.0, 0.0)
    fn = n * c * 1e5

    wp.atomic_sub(f, particle_no, fn)

    # apply forces (could do - f / 3 here)
    wp.atomic_add(f, i, fn * bary[0])
    wp.atomic_add(f, j, fn * bary[1])
    wp.atomic_add(f, k, fn * bary[2])

This adds a kernel that takes ~28 ms to a substep that used to have ~0.02 ms of GPU operations!

To reach these conclusions, I needed to modify the example to:

  • Set synchronize=True on the wp.ScopedTimer calls to get more accurate timings for the step and render timers.
  • Turn off CUDA graphs to get increased granularity in the timeline when profiled with Nsight Systems (see the sketch after this list). Not using CUDA graphs does have a negative impact on performance, but we already know the overall performance hit of self.model.enable_tri_collisions = True.
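
A rough sketch of the capture/launch pattern the warp.sim examples use (simulate_substeps here is a placeholder for the example's substep loop, not a real function):

use_cuda_graph = wp.get_device().is_cuda  # set to False for per-kernel granularity

graph = None
if use_cuda_graph:
    # record the substep kernels once into a CUDA graph
    with wp.ScopedCapture() as capture:
        simulate_substeps()
    graph = capture.graph

def step():
    if use_cuda_graph:
        wp.capture_launch(graph)  # replay all captured kernels as one graph
    else:
        simulate_substeps()       # direct launches: slower, but each kernel
                                  # shows up individually in Nsight Systems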

In this image, you can see that the GPU activity is dominated by these eval_triangles_contact calls:
[Image: Nsight Systems timeline dominated by repeated eval_triangles_contact kernel calls]

shi-eric changed the title from "[QUESTION] self.model.enable_tri_collisions = True makes particle_q = state.particle_q.numpy() super slow" to "[QUESTION] self.model.enable_tri_collisions Flag" Dec 30, 2024
AnkaChan (Contributor) commented

Hi 冰糖葫芦,
What integrator are you using? I assume you are using the Euler integrator, since that is the default option.
Enabling enable_tri_collisions adds a point-triangle collision response, which by itself is not very stable. Furthermore, the collision detection uses a brute-force search, leading to O(N^2) complexity.
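
For intuition, the kernel's indexing (face_no = tid // num_particles) implies a dispatch like the following sketch, with one thread per (triangle, particle) pair; this is inferred from the kernel signature quoted above, not copied from warp/sim/collide.py (model and state are the warp.sim Model and State objects):

wp.launch(
    kernel=eval_triangles_contact,
    dim=model.tri_count * model.particle_count,  # O(tri_count * particle_count) threads
    inputs=[
        model.particle_count,
        state.particle_q,
        state.particle_qd,
        model.tri_indices,
        model.tri_materials,
        model.particle_radius,
    ],
    outputs=[state.particle_f],
)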

That is why the simulation becomes slower after turning on enable_tri_collisions. The rendering time goes up because the simulate() function only launches the simulation kernels without waiting for them to finish. Rendering, however, requires synchronization, so most of the rendering step is spent waiting for the previously launched kernels to complete.
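
To see this effect in isolation, a minimal self-contained sketch (the kernel is just an artificial workload):

import warp as wp

@wp.kernel
def spin(a: wp.array(dtype=float)):
    i = wp.tid()
    x = a[i]
    for it in range(10000):
        x = wp.sin(x)
    a[i] = x

a = wp.ones(1 << 20, dtype=float)

with wp.ScopedTimer("launch"):  # appears fast: the kernel is only enqueued
    wp.launch(spin, dim=a.shape[0], inputs=[a])

with wp.ScopedTimer("numpy"):   # appears slow: the device-to-host copy
    host = a.numpy()            # waits for the kernel above to finish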

To be honest, achieving stable collision responses with an explicit integrator is very challenging, to say the least.
However, currently only the Euler integrator, which is explicit, supports self-collision; XPBD and VBD do not. The good news is that we are planning to release a new version in January that introduces a robust self-collision handler for the VBD integrator, which we believe is the best solution for this issue. The new integrator will provide a penetration-free guarantee for robust elasticity simulation.
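
For reference, switching integrators in warp.sim looks roughly like this (a sketch; VBDIntegrator exists in recent Warp versions, with the robust self-collision handling described above still to come):

integrator = wp.sim.SemiImplicitIntegrator()  # explicit Euler, the examples' default
# integrator = wp.sim.XPBDIntegrator()        # position-based dynamics
# integrator = wp.sim.VBDIntegrator(model)    # vertex block descent (VBD)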

shi-eric added the warp.sim label Jan 6, 2025