
Conversation


@thowell thowell commented Dec 16, 2025

This PR is part of the effort to implement sparse Jacobians (#88).

When is_sparse==True, instead of iterating over all dofs to construct contact constraints, we traverse the dof tree for each contact body.

mujoco reference: https://github.com/google-deepmind/mujoco/blob/08b4b4144d70c69206f96cf329d5044ae686a1e6/src/engine/engine_core_util.c#L55
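For intuition, here is a minimal pure-Python sketch of the traversal idea (not the Warp kernel in this PR; array names follow MuJoCo's mjModel convention, where `dof_parentid[i]` is the parent of dof i or -1 at the root, and the tiny model below is hypothetical):

```python
# Sketch of dof-tree traversal, assuming MuJoCo-style model arrays.
# Rather than looping over all nv dofs, walk only the ancestor chain
# of the dofs belonging to a contact body.

def ancestor_dofs(dof_parentid, body_dofadr, body_dofnum, body):
    """Return the dofs that can affect `body`, ordered leaf to root."""
    dofs = []
    # start at the body's last dof and follow parent links to the root
    i = body_dofadr[body] + body_dofnum[body] - 1
    while i >= 0:
        dofs.append(i)
        i = dof_parentid[i]
    return dofs

# Hypothetical 5-dof model: dofs 3 and 4 both branch off dof 2.
dof_parentid = [-1, 0, 1, 2, 2]
body_dofadr = [0, 1, 2, 3, 4]   # one dof per body
body_dofnum = [1, 1, 1, 1, 1]

print(ancestor_dofs(dof_parentid, body_dofadr, body_dofnum, 4))
# → [4, 2, 1, 0]  (dof 3, on the other branch, is skipped)
```

Only the dofs on the kinematic chain of the contact body contribute nonzero Jacobian entries, which is why the traversal pays off in the sparse case.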


humanoid

performance for dense should be unchanged

mjwarp-testspeed ./benchmark/humanoid/humanoid.xml --nconmax=24 --njmax=64 --nworld=8192 --event_trace=True

this pr:

Summary for 8192 parallel rollouts

Total JIT time: 0.33 s
Total simulation time: 2.96 s
Total steps per second: 2,767,435
Total realtime factor: 13,837.18 x
Total time per step: 361.35 ns
Total converged worlds: 8192 / 8192

step: 359.65
  forward: 357.14
    fwd_position: 89.85
      kinematics: 16.37
      com_pos: 5.85
      camlight: 1.75
      flex: 0.17
      crb: 13.22
      tendon_armature: 0.17
      collision: 9.35
        nxn_broadphase: 3.71
        convex_narrowphase: 0.17
        primitive_narrowphase: 4.57
      make_constraint: 39.00

main (bb81495):

Total JIT time: 0.32 s
Total simulation time: 2.96 s
Total steps per second: 2,767,848
Total realtime factor: 13,839.24 x
Total time per step: 361.29 ns
Total converged worlds: 8192 / 8192

step: 359.55
  forward: 357.04
    fwd_position: 89.84
      kinematics: 16.36
      com_pos: 5.85
      camlight: 1.75
      flex: 0.17
      crb: 13.23
      tendon_armature: 0.17
      collision: 9.35
        nxn_broadphase: 3.71
        convex_narrowphase: 0.18
        primitive_narrowphase: 4.57
      make_constraint: 38.99

performance for sparse should be improved

mjwarp-testspeed ./benchmark/humanoid/humanoid.xml --nconmax=24 --njmax=64 --nworld=8192 --event_trace=True -o "opt.is_sparse=True"

this pr:

Total JIT time: 0.81 s
Total simulation time: 3.10 s
Total steps per second: 2,641,695
Total realtime factor: 13,208.47 x
Total time per step: 378.54 ns
Total converged worlds: 8192 / 8192

Event trace:

step: 376.86
  forward: 374.33
    fwd_position: 77.87
      kinematics: 16.36
      com_pos: 5.84
      camlight: 1.74
      flex: 0.17
      crb: 9.98
      tendon_armature: 0.17
      collision: 9.34
        nxn_broadphase: 3.71
        convex_narrowphase: 0.17
        primitive_narrowphase: 4.57
      make_constraint: 30.65

main (bb81495):

Total JIT time: 0.25 s
Total simulation time: 3.17 s
Total steps per second: 2,586,971
Total realtime factor: 12,934.85 x
Total time per step: 386.55 ns
Total converged worlds: 8192 / 8192

Event trace:

step: 384.86
  forward: 382.33
    fwd_position: 86.29
      kinematics: 16.36
      com_pos: 5.84
      camlight: 1.74
      flex: 0.17
      crb: 9.99
      tendon_armature: 0.17
      collision: 9.34
        nxn_broadphase: 3.71
        convex_narrowphase: 0.17
        primitive_narrowphase: 4.56
      make_constraint: 38.73

notes:

  • performance should be further improved once efc.J is represented in a sparse format and it is not necessary to zero memory on each call to make_constraint
  • replacing repeated code in contact_pyramidal and contact_elliptic with a wp.func appeared to introduce overhead; as a result, to maintain performance there is duplicated code for the dense and sparse cases for now.

todo

  • improve tree traversal performance with dense

@adenzler-nvidia

I think the tree traversal strategy could be faster even with dense Jacobians. What do you think? We would need to zero the memory in that case, but I'm sure we can find a good way to do it.


thowell commented Dec 19, 2025

@adenzler-nvidia yes, I think the tree traversal could be faster with dense as well. Added a todo for making the dense version performant with tree traversal.

@erikfrey

@thowell usually I'm in favor of incremental changes and TODOs, but in this case we're adding some complexity (cache_kernel, wp.static, etc.) that we might remove if it turns out that dof tree traversal makes sense for both dense and sparse.

Would you mind having a go at seeing whether it helps in both cases and if so we can simplify the changes in this PR?

In general I'm not a huge fan of cache_kernel, nested_kernel - I really try to use them sparingly when there's no other choice.

@thowell

thowell commented Jan 5, 2026

@erikfrey I think ultimately all of the constraint functions will need this complexity in order to support dense and sparse without additional overhead; see #934

@thowell

thowell commented Jan 10, 2026

One reason why it might not make sense to perform dof traversal for dense: the efc.J row needs the elements not visited by the traversal to be zeroed, and the traversal plus zeroing might be more expensive than simply iterating over all dofs. In the sparse case it is not necessary to zero elements.
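To make the write-count argument concrete, here is a minimal sketch (hypothetical sizes and values, plain Python rather than the Warp kernels) of filling one efc.J row both ways:

```python
# Dense vs. sparse writes for one contact-constraint Jacobian row.
# Dense: all nv entries must end up written — dofs not on the contact
# body's ancestor chain still need explicit zeros, so tree traversal
# plus zeroing can cost more than a plain loop over all dofs.
# Sparse: only the visited dofs are stored, and nothing is zeroed.

nv = 6
chain = [5, 3, 1, 0]             # dofs visited by tree traversal
values = [0.5, -0.2, 1.0, 0.3]   # hypothetical Jacobian entries

# dense row: nv writes no matter how short the chain is
row_dense = [0.0] * nv
for d, v in zip(chain, values):
    row_dense[d] = v
print(row_dense)   # the zeros at dofs 2 and 4 had to be written too

# sparse row: len(chain) index/value pairs, no zeroing required
row_idx, row_val = list(chain), list(values)
print(row_idx, row_val)
```

With a short chain and a large nv the sparse representation touches far less memory per row, which matches the make_constraint timings above.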

