Remove panics and optimise mesh picking #16148
Conversation
It looks good in itself, but I'm wondering whether the function should return a `Result` instead. It has multiple failure modes.
On an M3 Max, I'm seeing 3% regressions to 1% improvements. Probably noise.
Yea, I wondered that too; I would lean towards this. Admittedly, malformed meshes are probably not too common. I'm kinda on the fence about whether the default public API for mesh picking should bother with them or not. I imagine it would be frustrating if some of your meshes "aren't working" when you try to click on them. Tracking it down to a malformed mesh might be a bit of a journey, especially if you have some kind of user-generated content and malformed meshes are making it into the game.
It might be better to
I'm open to the idea, but I think it's equally as likely to cause false positives.
I think it makes sense for normals - if the positions are coming back fine, but the normals are broken, you can still cast against the mesh as if it had no normals.
If it's possible to make the parameter(s) more strongly typed to avoid dealing with certain failure modes (invalid mesh/triangle), that would be nice. Otherwise, this direction is fine.
Approving for now since it is an improvement.
@aevyrie the nesting order of the `Option<Option<>>` is wrong for the
I like the idea of a pre-validated Mesh*, but that's a much bigger change :P |
That's because it is no longer failing when the normals are missing. Instead, it is continuing to raycast, ignoring the normals - I think this is better behavior considering normals are already optional. There is one mistake though: the `map()?;` should be `and_then()`. That was causing the simple raycast test to fail.
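A minimal sketch (plain std `Option`, not the actual bevy_picking code) of the nesting issue mentioned here: `map` with a closure that itself returns an `Option` produces an `Option<Option<_>>`, while `and_then` flattens it to a single layer:

```rust
// Illustrative stand-in for an optional per-vertex normal lookup:
// takes Option-wrapped data and may itself fail, returning Option.
fn vertex_normal(normals: Option<&[f32; 3]>) -> Option<[f32; 3]> {
    normals.copied()
}

fn main() {
    let normals: Option<&[f32; 3]> = None;

    // `map` nests: outer Some (we had mesh data), inner None (lookup failed).
    let nested: Option<Option<[f32; 3]>> = Some(normals).map(vertex_normal);
    assert_eq!(nested, Some(None));

    // `and_then` flattens to a single Option.
    let flat: Option<[f32; 3]> = Some(normals).and_then(vertex_normal);
    assert_eq!(flat, None);
}
```

The practical difference: `Some(None).is_some()` is `true`, so the nested form can make a missing-normals case look like a success.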
Ah, the combo of the two was confusing me. I didn't realise normals weren't used for hit detection; I care a lot less about wanting to short-circuit if they're wrong now 👍. Added a benchmark for the no-indices case since it wasn't represented. I tried a few ways of indexing and this had the best perf.
Compared to main now, this branch has between 41-47% worse performance. I'm not 100% sure why the difference is so big from my initial commit's 0-10% improvement to now. I haven't spent much time attempting to optimise anything, so there are likely some opportunities here, for example only fetching normals if there is a triangle intersection instead of before. I still think this is a preferable direction. For one, this is a new feature in bevy, so there isn't an established/expected performance baseline. Of course this code should be as fast as possible, but ideally users generate optimised colliders and cast against them in a proper physics sense.
I don't think we can make any sacrifices in perf here, it is already slower than it should be, and 40-50% is a huge hit. I can poke this a bit more and see if I can find a solution as well.
Yup! Normals are only used to compute the smooth normal used for lighting, but it doesn't change the actual hit position in space. If triangle normals are missing, it uses the face of the triangle for the normal. I'm going to try an alternate method here. I think the goal of not crashing due to bad indices is a good one, but I don't think sacrificing perf is an option here, because it is on such a hot path. |
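A rough sketch of that fallback under assumed names (this is not the bevy implementation, just the standard construction): the face normal is the normalized cross product of two triangle edges, so it can always be recovered from positions alone when vertex normals are missing.

```rust
// Cross product of two 3-vectors, written out by hand to stay dependency-free.
fn cross(a: [f32; 3], b: [f32; 3]) -> [f32; 3] {
    [
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    ]
}

// Normalized face normal of triangle (a, b, c), from its two edges.
fn face_normal(a: [f32; 3], b: [f32; 3], c: [f32; 3]) -> [f32; 3] {
    let ab = [b[0] - a[0], b[1] - a[1], b[2] - a[2]];
    let ac = [c[0] - a[0], c[1] - a[1], c[2] - a[2]];
    let n = cross(ab, ac);
    let len = (n[0] * n[0] + n[1] * n[1] + n[2] * n[2]).sqrt();
    [n[0] / len, n[1] / len, n[2] / len]
}

fn main() {
    // A counter-clockwise triangle in the XY plane has a +Z face normal.
    let n = face_normal([0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]);
    assert_eq!(n, [0.0, 0.0, 1.0]);
}
```

Note the winding order matters: swapping two vertices flips the normal's sign.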
Do you have other benchmarks you're going off? All I can find is a few. As far as I've seen there hasn't been any optimisation done since merging, so I'm curious about the pushback. I think there are quite a few options we have; I mentioned deferring normal resolution until a hit is found. It seems like it would be amenable to some level of parallelization too, not to mention switching to optimised colliders/LOD over full meshes, or any kind of culling. Thoughts on an "unchecked_picking" feature flag that would avoid the checked indexing for people who are less risk-averse?
I wrote the original impl in mod_raycast. This is in the hottest of hot loops. Any slowdown here is multiplied by any meshes that have their AABB hit. There have been 2ish perf improvements since this was merged. The wins are usually on the order of 5%-20% improvements, it's pretty hard to claw performance here.
This isn't acceptable, I agree, but there should be a way to avoid the crashes without halving performance.
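One shape the panic-free direction can take, sketched with illustrative names (not bevy's actual API): fetch a triangle's vertices with slice `get`, so malformed indices produce `None` instead of an out-of-bounds panic, and let the caller skip that triangle.

```rust
// Hypothetical helper: look up the three vertex positions of triangle
// `triangle` in an indexed mesh. Any out-of-range index yields None
// rather than panicking, so bad mesh data degrades to "no hit".
fn triangle_vertices(
    positions: &[[f32; 3]],
    indices: &[u32],
    triangle: usize,
) -> Option<[[f32; 3]; 3]> {
    let tri = indices.get(triangle * 3..triangle * 3 + 3)?;
    Some([
        *positions.get(tri[0] as usize)?,
        *positions.get(tri[1] as usize)?,
        *positions.get(tri[2] as usize)?,
    ])
}

fn main() {
    let positions = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]];
    let good = [0u32, 1, 2];
    let bad = [0u32, 1, 99]; // out-of-bounds index: malformed mesh data

    assert!(triangle_vertices(&positions, &good, 0).is_some());
    assert_eq!(triangle_vertices(&positions, &bad, 0), None);
}
```

The perf question discussed in this thread is whether the extra bounds checks in this hot loop can be made free (or close to it), e.g. by validating once up front instead of per-lookup.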
The mesh picking backend already heavily parallelizes and culls using AABBs, that's the only way this is performant at all. At the end of the day, this isn't really a full solution for the editor, it's a stopgap until we have a physics engine integrated, and we can use real acceleration structures, that or a shader picking backend.
I didn't look very hard then, whoops! Ah, ok, good context :) might be cool if we added a picking stress-test example so we have something more concrete for contributors to measure against. From my perspective all I see is microbench perf in an unreleased feature, so it's hard to evaluate perf in any real sense. Good to hear. I guess I was looking too closely at just these functions and didn't see the outer parallelization! Agree on the physics engine bit!
I might have some time to poke around over the week. If you have any suggestions I could try them out, otherwise will just bash my head against it xD
That is completely understandable. 🙂 It would be nice to add benchmarks that test the full raycast as well, but when this benchmark was made, there was no parallelization or culling, so it really was everything. For the record, I think the tests you've added here are invaluable, and I don't mean to put a damper on things. Feel free to ignore nits about code brevity; as long as performance hasn't regressed, this should be merged.
Any more thoughts @aevyrie? If you prefer I could try and source another reviewer if you don't have the bandwidth available!
Sorry, this fell off my radar.
- The benchmarks and tests are a straight upgrade, very appreciated, no notes.
- The changes to the intersection code still give me pause. The only thing I'd like to see before merging is testing this in a real scene, and the closest we have is the caldera hotel. I'd say that as long as we aren't regressing in this case, I'm happy to approve and merge.
- In all likelihood, performance will be unchanged, but I'd really like to see a representative case tested before merging. This is a hot path and small tweaks can have surprising impacts to performance, so I'm being pretty cautious.
- I'm happy to help with this, I can also try doing this myself. I think the easiest way to test this will be using the mesh picking backend, loading in the caldera hotel gltf, and running tracy, while mousing over the same spot in the scene.
Yea, we could possibly create some kind of picking stress test, but the trouble is having realistic test data, so it's cool you could test with an actual scene.
@tbillington @BD103 I'd like to merge this; ping me when merge conflicts are resolved please :) |
Not finished yet, but while converting the overlapping_planes benchmarks the test was failing. Even though I initially thought about removing it as aevyrie suggested, I wanted to make sure the test at least passed so there was less chance of something funky from my refactor. I tried for about an hour to figure out why; it seems moving the y position of the vertices of the plane would cause the ray to never intersect. I can't work out if something is being rotated somewhere or something else is going on, so I'm a little hesitant until I can figure out exactly what is wrong - whether it's just the silly mesh generation code I have or if it's actually something related to my picking refactor.
@BD103 I've covered your requests I'm pretty sure, though I stopped short of fully utilising your
Awesome! I'll do a thorough review of your changes later today, but I did notice some commented-out code. If |
Yea, it was more of a work-in-progress commit to get your eyes on it. But yes, sorting out the commented-out stuff is next.
@tbillington In order to not hold up this PR, could you split the benchmark changes into a separate pull request? That way the actual optimizations can be merged while we work on the benchmarks separately. |
@tbillington gentle ping on this. I'd like to get this into 0.16, and I think it's largely ready. |
Cutting from the milestone and putting this up for adoption for now: it's been a couple of weeks without a response but I really do think that this work is valuable. |
Closing as mostly adopted. The benchmark changes are still desired in the future! |
_Note from BD103: this PR was adopted from #16148. The majority of this PR's description is copied from the original._

# Objective

Adds tests to cover various mesh picking cases and removes sources of panics. It should prevent users being able to trigger panics in `bevy_picking` code via bad mesh data such as #15891, and is a follow-up to my comments in [#15800 (review)](#15800 (review)). This is motivated by #15979.

## Testing

Adds 8 new tests to cover `ray_mesh_intersection` code.

## Changes from original PR

I reverted the changes to the benchmarks, since that was the largest factor blocking it merging. I'll open a follow-up issue so that those benchmark changes can be implemented.

---------

Co-authored-by: Trent <[email protected]>
Objective

Adds tests to cover various mesh picking cases and removes sources of panics. It should prevent users being able to trigger panics in `bevy_picking` code via bad mesh data such as #15891, and is a follow-up to my comments in #15800 (review). This is motivated by #15979.

Testing

Adds 8 new tests to cover `ray_mesh_intersection` code.

Benchmarks

Included benchmarks done on an M1 Pro laptop. I also tested on an i5-4590 with Windows 11 and the results were in the same direction, albeit showing a bigger improvement over `main` than the Mac. `elem/s` measures triangles tested per second. For meshes with indices, it is `indices.len() / 3`; for meshes without, it is `positions.len() / 3`.
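As a sanity check on the units, here's a tiny sketch (hypothetical helper, not part of the benchmark code) of how the `elem/s` figures relate to the measured times:

```rust
// Hypothetical helper: derives a Melem/s throughput figure from a
// triangle count and an elapsed time in seconds.
fn melem_per_s(triangle_count: usize, elapsed_secs: f64) -> f64 {
    triangle_count as f64 / elapsed_secs / 1.0e6
}

fn main() {
    // A mesh with 486 indices has indices.len() / 3 = 162 triangles.
    let indices: Vec<u32> = (0..486).collect();
    let triangles = indices.len() / 3;
    assert_eq!(triangles, 162);

    // 162 triangles tested in 717.03 ns is roughly 226 Melem/s,
    // matching the first table row below.
    let rate = melem_per_s(triangles, 717.03e-9);
    assert!((rate - 225.93).abs() < 0.1);
}
```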
To compare this PR vs `main`, I first ran `cargo bench --bench ray_mesh_intersection` on this PR branch, then I copied the previous contents of `intersection.rs` into my branch and re-ran the bench cmd to get the exact same benchmark suite with the previous implementation.

Existing
Existing benchmarks are all significantly faster than `main`.

`ray_mesh_intersection`

| triangles | `main` | this PR |
| --- | --- | --- |
| 162 | 1.3025 µs, 124.37 Melem/s | 717.03 ns, 225.93 Melem/s |
| 19_602 | 112.26 µs, 174.61 Melem/s | 62.717 µs, 312.55 Melem/s |
| 1_996_002 | 11.166 ms, 178.76 Melem/s | 6.1663 ms, 323.69 Melem/s |

`ray_mesh_intersection_no_cull`

| triangles | `main` | this PR |
| --- | --- | --- |
| 162 | 1.3345 µs, 121.39 Melem/s | 716.67 ns, 226.05 Melem/s |
| 19_602 | 113.84 µs, 172.19 Melem/s | 63.624 µs, 308.09 Melem/s |
| 1_996_002 | 11.373 ms, 175.50 Melem/s | 6.3036 ms, 316.64 Melem/s |

`ray_mesh_intersection_no_intersection`

| triangles | `main` | this PR |
| --- | --- | --- |
| 162 | 1.2987 µs, 124.74 Melem/s | 707.89 ns, 228.85 Melem/s |
| 19_602 | 112.60 µs, 174.08 Melem/s | 62.372 µs, 314.27 Melem/s |
| 1_996_002 | 11.255 ms, 177.35 Melem/s | 6.1949 ms, 322.20 Melem/s |
New

`overlapping_planes_back_to_front` is on par with `main`, though faster for tiny meshes. It generates 10, 1_000, and 100_000 quads that are ordered such that each subsequent quad tested will be "the new closest" hit. It's quite a terrible scenario, but it's good to know the worst-case performance.

| triangles | `main` | this PR |
| --- | --- | --- |
| 20 | 307.68 ns, 65.002 Melem/s | 130.08 ns, 153.75 Melem/s |
| 2_000 | 27.947 µs, 71.564 Melem/s | 29.579 µs, 67.616 Melem/s |
| 200_000 | 2.8266 ms, 70.756 Melem/s | 2.8997 ms, 68.973 Melem/s |

`overlapping_planes_front_to_back` is significantly faster than `main`. It generates 10, 1_000, and 100_000 quads that are ordered such that the first quad is recognised as the "closest hit", and while each subsequent quad tested will be hit, it will fail the distance comparison to the "current closest". It's not quite as bad a scenario as the previous benchmark, but it's still pretty bad.

| triangles | `main` | this PR |
| --- | --- | --- |
| 20 | 307.79 ns, 64.979 Melem/s | 132.76 ns, 150.65 Melem/s |
| 2_000 | 27.935 µs, 71.594 Melem/s | 10.241 µs, 195.30 Melem/s |
| 200_000 | 2.8140 ms, 71.073 Melem/s | 1.0264 ms, 194.85 Melem/s |
`ray_mesh_intersection_single_plane_indices` is significantly faster than `main`.

| triangles | `main` | this PR |
| --- | --- | --- |
| 242 | 1.4731 µs, 164.28 Melem/s | 851.79 ns, 284.11 Melem/s |
| 20_402 | 114.88 µs, 177.60 Melem/s | 62.320 µs, 327.37 Melem/s |
| 2_004_002 | 11.365 ms, 176.33 Melem/s | 6.1316 ms, 326.83 Melem/s |

`ray_mesh_intersection_single_plane_no_indices` is significantly faster than `main`.

| triangles | `main` | this PR |
| --- | --- | --- |
| 242 | 1.2379 µs, 195.49 Melem/s | 800.99 ns, 302.13 Melem/s |
| 20_402 | 95.915 µs, 212.71 Melem/s | 58.634 µs, 347.96 Melem/s |
| 2_004_002 | 9.5208 ms, 210.49 Melem/s | 5.8238 ms, 344.11 Melem/s |