Remove panics and optimise mesh picking #16148


Closed

Conversation

tbillington
Contributor

@tbillington tbillington commented Oct 29, 2024

Objective

Adds tests to cover various mesh picking cases and removes sources of panics.

It should prevent users from being able to trigger panics in `bevy_picking` code via bad mesh data such as #15891, and is a follow-up to my comments in #15800 (review).

This is motivated by #15979

Testing

Adds 8 new tests to cover `ray_mesh_intersection` code.

Benchmarks

Benchmarks below were run on an M1 Pro laptop. I also tested on an i5-4590 with Windows 11 and the results were in the same direction, albeit showing a bigger improvement over main than on the Mac.

elem/s measures triangles tested per second. For meshes with indices it is `indices.len() / 3`; for meshes without, it is `positions.len() / 3`.
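
As a small illustrative sketch of that computation (standalone Rust; the function and parameter names are mine, not bevy's):

```rust
// How the benchmark's elem/s denominator is derived: the number of
// triangles a mesh contributes. Standalone sketch, not bevy_picking code.
fn triangle_count(positions_len: usize, indices_len: Option<usize>) -> usize {
    match indices_len {
        // Indexed mesh: every three indices describe one triangle.
        Some(n) => n / 3,
        // Non-indexed mesh: every three vertices describe one triangle.
        None => positions_len / 3,
    }
}
```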

To compare this PR vs main, I first ran `cargo bench --bench ray_mesh_intersection` on this PR branch, then copied the previous contents of `intersection.rs` into my branch and re-ran the bench command to get the exact same benchmark suite with the previous implementation.

Existing

On all of the existing benchmarks, this PR is significantly faster than main.

`ray_mesh_intersection`

| bench | main | pr |
| --- | --- | --- |
| 162 triangles | 1.3025 µs (124.37 Melem/s) | 717.03 ns (225.93 Melem/s) |
| 19_602 triangles | 112.26 µs (174.61 Melem/s) | 62.717 µs (312.55 Melem/s) |
| 1_996_002 triangles | 11.166 ms (178.76 Melem/s) | 6.1663 ms (323.69 Melem/s) |

`ray_mesh_intersection_no_cull`

| bench | main | pr |
| --- | --- | --- |
| 162 triangles | 1.3345 µs (121.39 Melem/s) | 716.67 ns (226.05 Melem/s) |
| 19_602 triangles | 113.84 µs (172.19 Melem/s) | 63.624 µs (308.09 Melem/s) |
| 1_996_002 triangles | 11.373 ms (175.50 Melem/s) | 6.3036 ms (316.64 Melem/s) |

`ray_mesh_intersection_no_intersection`

| bench | main | pr |
| --- | --- | --- |
| 162 triangles | 1.2987 µs (124.74 Melem/s) | 707.89 ns (228.85 Melem/s) |
| 19_602 triangles | 112.60 µs (174.08 Melem/s) | 62.372 µs (314.27 Melem/s) |
| 1_996_002 triangles | 11.255 ms (177.35 Melem/s) | 6.1949 ms (322.20 Melem/s) |

New

`overlapping_planes_back_to_front` is on par with main, though faster for tiny meshes. It generates 10, 1_000, or 100_000 quads, ordered such that each subsequent quad tested will be "the new closest" hit. It's quite a terrible scenario, but it's good to know the worst-case performance.

| bench | main | pr |
| --- | --- | --- |
| 20 triangles | 307.68 ns (65.002 Melem/s) | 130.08 ns (153.75 Melem/s) |
| 2_000 triangles | 27.947 µs (71.564 Melem/s) | 29.579 µs (67.616 Melem/s) |
| 200_000 triangles | 2.8266 ms (70.756 Melem/s) | 2.8997 ms (68.973 Melem/s) |

`overlapping_planes_front_to_back` is significantly faster than main. It generates 10, 1_000, or 100_000 quads, ordered such that the first quad is recognised as the "closest hit"; each subsequent quad tested will be hit, but fails the distance comparison against the current closest. It's not quite as bad a scenario as the previous benchmark, but it's still pretty bad.

| bench | main | pr |
| --- | --- | --- |
| 20 triangles | 307.79 ns (64.979 Melem/s) | 132.76 ns (150.65 Melem/s) |
| 2_000 triangles | 27.935 µs (71.594 Melem/s) | 10.241 µs (195.30 Melem/s) |
| 200_000 triangles | 2.8140 ms (71.073 Melem/s) | 1.0264 ms (194.85 Melem/s) |

`ray_mesh_intersection_single_plane_indices` is significantly faster than main.

| bench | main | pr |
| --- | --- | --- |
| 242 triangles | 1.4731 µs (164.28 Melem/s) | 851.79 ns (284.11 Melem/s) |
| 20_402 triangles | 114.88 µs (177.60 Melem/s) | 62.320 µs (327.37 Melem/s) |
| 2_004_002 triangles | 11.365 ms (176.33 Melem/s) | 6.1316 ms (326.83 Melem/s) |

`ray_mesh_intersection_single_plane_no_indices` is significantly faster than main.

| bench | main | pr |
| --- | --- | --- |
| 242 triangles | 1.2379 µs (195.49 Melem/s) | 800.99 ns (302.13 Melem/s) |
| 20_402 triangles | 95.915 µs (212.71 Melem/s) | 58.634 µs (347.96 Melem/s) |
| 2_004_002 triangles | 9.5208 ms (210.49 Melem/s) | 5.8238 ms (344.11 Melem/s) |

@BenjaminBrienen BenjaminBrienen added C-Usability A targeted quality-of-life change that makes Bevy easier to use D-Straightforward Simple bug fixes and API improvements, docs, test and examples S-Needs-Review Needs reviewer attention (from anyone!) to move forward A-Picking Pointing at and selecting objects of all sorts labels Oct 29, 2024
Contributor

@BenjaminBrienen BenjaminBrienen left a comment

It looks good in itself, but I'm wondering whether the function should return a Result instead. It has multiple failure modes.

@aevyrie
Member

aevyrie commented Oct 29, 2024

On an M3 Max, I'm seeing 3% regressions to 1% improvements. Probably noise.

@tbillington
Contributor Author

Yea I wondered that too. I would lean towards `ray_mesh_intersection` returning a `Result` with a simple error enum. That way calling code in `bevy_picking` can decide to use it or ignore it where it makes sense, but users would have the option to do the intersection themselves and handle errors.

Admittedly, malformed meshes are probably not too common. I'm kinda on the fence about whether the default public API for mesh picking should bother with them or not.

I imagine it would be frustrating if some of your meshes "aren't working" when you try to click on them. To track it down to a malformed mesh might be a little bit of a journey. Especially if you have some kind of user generated content and malformed meshes are making it into the game.
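
A minimal sketch of what such a `Result`-based API could look like (the enum name, variants, and helper function here are hypothetical illustrations, not actual `bevy_picking` types):

```rust
// Hypothetical error enum for a Result-returning intersection API.
// Names are illustrative only, not the PR's actual code.
#[derive(Debug, PartialEq)]
enum RayMeshError {
    /// An index referred to a vertex outside the position buffer.
    IndexOutOfBounds,
    /// The index buffer length was not a multiple of 3.
    MalformedIndices,
}

fn checked_triangle(
    positions: &[[f32; 3]],
    tri_indices: [usize; 3],
) -> Result<[[f32; 3]; 3], RayMeshError> {
    // `get` returns None instead of panicking on a bad index, which we
    // surface as a typed error the caller can handle or ignore.
    let a = *positions.get(tri_indices[0]).ok_or(RayMeshError::IndexOutOfBounds)?;
    let b = *positions.get(tri_indices[1]).ok_or(RayMeshError::IndexOutOfBounds)?;
    let c = *positions.get(tri_indices[2]).ok_or(RayMeshError::IndexOutOfBounds)?;
    Ok([a, b, c])
}
```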

@aevyrie
Member

aevyrie commented Oct 29, 2024

I imagine it would be frustrating if some of your meshes "aren't working" when you try to click on them. To track it down to a malformed mesh might be a little bit of a journey. Especially if you have some kind of user generated content and malformed meshes are making it into the game.

It might be better to continue the loop instead of early exiting. That way if a single triangle is degenerate, it doesn't completely break raycasting.

@tbillington
Contributor Author

It might be better to continue the loop instead of early exiting. That way if a single triangle is degenerate, it doesn't completely break raycasting.

I'm open to the idea, but I think it's just as likely to cause false positives.

@aevyrie
Member

aevyrie commented Oct 29, 2024

I think it makes sense for normals - if the positions are coming back fine, but the normals are broken, you can still cast against the mesh as if it had no normals.
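
A sketch of that behaviour, assuming normals arrive as an optional slice (hypothetical helper, not the PR's actual code): any bad or missing normal lookup yields `None`, and the cast can continue as if the mesh had no normals rather than aborting.

```rust
// Positions drive the hit test; normals only shade the result. So a
// broken normal lookup degrades to None instead of failing the raycast.
// Hypothetical helper shape, not the PR's actual code.
fn smooth_normals(
    normals: Option<&[[f32; 3]]>,
    tri: [usize; 3],
) -> Option<[[f32; 3]; 3]> {
    let normals = normals?;
    // `?`-chained gets: an out-of-range index yields None, and the caller
    // treats the triangle as having no vertex normals.
    Some([
        *normals.get(tri[0])?,
        *normals.get(tri[1])?,
        *normals.get(tri[2])?,
    ])
}
```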

@BenjaminBrienen
Contributor

If it's possible to make the parameter(s) more strongly typed to avoid dealing with certain failure modes (invalid mesh/triangle), that would be nice. Otherwise, this direction is fine.

@BenjaminBrienen
Contributor

Approving for now since it is an improvement

@tbillington
Contributor Author

@aevyrie the nesting order of the `Option<Option<_>>` is wrong for the `?`; that's why I had `.transpose` before, to swap them.

With the suggestions applied, the tests start failing:
[screenshot of failing tests]
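
For context on `.transpose`: std's `Option::transpose` swaps the nesting of `Option` and `Result` so that `?` applies to the layer you want. A toy illustration, not the PR's actual types:

```rust
// Option::transpose turns Option<Result<T, E>> into Result<Option<T>, E>,
// re-nesting the wrappers so `?` can propagate the inner error.
fn parse_optional(input: Option<&str>) -> Result<Option<i32>, std::num::ParseIntError> {
    // map produces Option<Result<i32, _>>: `?` can't propagate the error yet.
    let nested: Option<Result<i32, std::num::ParseIntError>> = input.map(str::parse);
    // transpose swaps the nesting; now `?` would work, or we return directly.
    nested.transpose()
}
```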

@tbillington
Contributor Author

If it's possible to make the parameter(s) more strongly typed to avoid dealing with certain failure modes (invalid mesh/triangle), that would be nice. Otherwise, this direction is fine.

I like the idea of a pre-validated Mesh*, but that's a much bigger change :P

@aevyrie
Member

aevyrie commented Oct 29, 2024

That's because it is no longer failing when the normals are missing. Instead, it is continuing to raycast, ignoring the normals. I think this is better behavior considering normals are already optional. There is one mistake though: the `map()?;` should be `and_then()`; that was causing the simple raycast test to fail.
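
The `map` vs `and_then` distinction in miniature (toy code, not the PR's): when the closure itself returns an `Option`, `map` double-wraps the value, while `and_then` flattens it.

```rust
// With a closure returning Option, map yields Option<Option<T>> while
// and_then yields Option<T>. Toy example, not the PR's actual code.
fn checked_half(value: Option<i32>) -> Option<i32> {
    let halve = |v: i32| if v % 2 == 0 { Some(v / 2) } else { None };
    // value.map(halve) would give Option<Option<i32>>; and_then flattens,
    // so an inner None (odd input) propagates as a plain None.
    value.and_then(halve)
}
```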

@tbillington
Contributor Author

tbillington commented Oct 29, 2024

Ah, the combo of the two was confusing me. I didn't realise normals weren't used for hit detection; I care a lot less about wanting to short-circuit if they're wrong now 👍.

Added a benchmark for the no indices case since it wasn't represented. I tried a few ways of indexing and this had the best perf.

@tbillington
Contributor Author

Compared to main now, this branch has between 41-47% worse performance. I'm not 100% sure why the difference is so big between my initial commit's 0-10% improvement and now.

I haven't spent much time attempting to optimise anything, so there are likely some opportunities here. For example, only fetching normals if there is a triangle intersection instead of before, etc.

I still think this is a preferable direction. For one, this is a new feature in Bevy, so there isn't an established/expected performance baseline. Of course this code should be as fast as possible, but ideally users generate optimised colliders and cast against those in a proper physics sense.

@aevyrie
Member

aevyrie commented Oct 29, 2024

I don't think we can make any sacrifices in perf here, it is already slower than it should be, and 40-50% is a huge hit. I can poke this a bit more and see if I can find a solution as well.

I didn't realise normals weren't used for hit detection

Yup! Normals are only used to compute the smooth normal used for lighting; they don't change the actual hit position in space. If normals are missing, the face of the triangle is used for the normal.

I'm going to try an alternate method here. I think the goal of not crashing due to bad indices is a good one, but I don't think sacrificing perf is an option here, because it is on such a hot path.
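
The face-normal fallback mentioned above can be sketched with a cross product (plain arrays standing in for Bevy's `Vec3`; illustrative only):

```rust
// When vertex normals are missing, the triangle's geometric normal
// (normalized cross product of two edge vectors) can be used instead.
fn face_normal(a: [f32; 3], b: [f32; 3], c: [f32; 3]) -> [f32; 3] {
    let u = [b[0] - a[0], b[1] - a[1], b[2] - a[2]];
    let v = [c[0] - a[0], c[1] - a[1], c[2] - a[2]];
    // Cross product of the two edges is perpendicular to the face.
    let n = [
        u[1] * v[2] - u[2] * v[1],
        u[2] * v[0] - u[0] * v[2],
        u[0] * v[1] - u[1] * v[0],
    ];
    let len = (n[0] * n[0] + n[1] * n[1] + n[2] * n[2]).sqrt();
    [n[0] / len, n[1] / len, n[2] / len]
}
```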

@tbillington
Contributor Author

Do you have other benchmarks you're going off? All I can find is a few in benches.

As far as I've seen, there hasn't been any optimisation done since merging, so I'm curious about the pushback.
How can it be a new feature and yet we already can't compromise any perf for correctness? Especially if this is going to be used in the Bevy editor too. Any bad asset or buggy procedural generation will crash the editor and/or the game. I'm really struggling to understand the rationale.

I think we have quite a few options; I mentioned deferring normal resolution until a hit is found. It seems like it would be amenable to some level of parallelization too, not to mention switching to optimised colliders/LODs over full meshes, or any kind of culling.

Thoughts on an "unchecked_picking" feature flag that would avoid the checked indexing for people who are less risk averse?
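
A hedged sketch of what such an opt-in flag could look like (the feature name and function are hypothetical, not part of this PR): the safe, bounds-checked path is the default, and the unchecked path only exists when the feature is enabled.

```rust
// Hypothetical "unchecked_picking" feature flag sketch. Default build keeps
// the panic-free bounds-checked access; the opt-in variant skips the check.
#[cfg(not(feature = "unchecked_picking"))]
fn fetch_position(positions: &[[f32; 3]], index: usize) -> Option<[f32; 3]> {
    // Default: checked access that can never panic or read out of bounds.
    positions.get(index).copied()
}

#[cfg(feature = "unchecked_picking")]
fn fetch_position(positions: &[[f32; 3]], index: usize) -> Option<[f32; 3]> {
    // Opt-in fast path for users who trust their mesh data.
    // SAFETY: enabling the feature is a promise that indices are in bounds.
    unsafe { Some(*positions.get_unchecked(index)) }
}
```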

@aevyrie
Member

aevyrie commented Oct 29, 2024

As far as I've seen there hasn't been any optimisation done since merging, so I'm curious about the pushback.

I wrote the original impl in `mod_raycast`. This is in the hottest of hot loops; any slowdown here is multiplied across every mesh whose AABB is hit. There have been two-ish perf improvements since this was merged. The wins are usually on the order of 5%-20% improvements; it's pretty hard to claw back performance here.

Any bad asset or buggy procedural generation will crash the editor and/or the game. I'm really struggling to understand the rationale.

This isn't acceptable, I agree, but there should be a way to avoid the crashes without halving performance.

be amenable to some level of parallelization too. Not to mention switching to optimised colliders/LOD over full meshes, or any kind of culling.

The mesh picking backend already heavily parallelizes and culls using AABBs, that's the only way this is performant at all. At the end of the day, this isn't really a full solution for the editor, it's a stopgap until we have a physics engine integrated, and we can use real acceleration structures, that or a shader picking backend.

@tbillington
Contributor Author

tbillington commented Oct 29, 2024

There have been 2ish perf improvements since this was merged.

I didn't look very hard then, woops!

Ah, ok, good context :) It might be cool if we added a picking stress-test example so we have something more concrete for contributors to measure against. From my perspective, all I see is microbench perf in an unreleased feature, so it's hard to evaluate perf in any real sense.

Good to hear. I guess I was looking too closely at just these functions, didn't see the outer parallelization!

Agree on the physics engine bit!

@tbillington
Contributor Author

I might have some time to poke around over the week. If you have any suggestions I could try them out, otherwise will just bash my head against it xD

@aevyrie
Member

aevyrie commented Oct 29, 2024

From my perspective all I see is microbench perf in an unreleased feature so it's hard to evaluate perf in any real sense.

That is completely understandable. 🙂 It would be nice to add benchmarks that test the full raycast as well, but when this benchmark was made there was no parallelization or culling, so it really was everything.

For the record, I think the tests you've added here are invaluable, and I don't mean to put a damper on things. Feel free to ignore nits about code brevity, as long as performance hasn't regressed, this should be merged.

@tbillington tbillington changed the title Add tests and remove sources of panics in mesh picking Remove panics and optimise mesh picking Nov 14, 2024
@tbillington
Contributor Author

Any more thoughts @aevyrie ? If you prefer I could try and source another reviewer if you don't have the bandwidth available !

Member

@aevyrie aevyrie left a comment

Sorry, this fell off my radar.

  • The benchmarks and tests are a straight upgrade, very appreciated, no notes.
  • The changes to the intersection code still give me pause. The only thing I'd like to see before merging is testing this in a real scene, and the closest we have is the caldera hotel. I'd say that as long as we aren't regressing in this case, I'm happy to approve and merge.
    • In all likelihood, performance will be unchanged, but I'd really like to see a representative case tested before merging. This is a hot path and small tweaks can have surprising impacts to performance, so I'm being pretty cautious.
    • I'm happy to help with this, I can also try doing this myself. I think the easiest way to test this will be using the mesh picking backend, loading in the caldera hotel gltf, and running tracy, while mousing over the same spot in the scene.

Member

@aevyrie aevyrie left a comment

I hooked up bevy's scene viewer, loaded caldera hotel, and compared main to your branch, and your branch seems to be faster!

main:
[screenshot]

PR:
[screenshot]

I also verified raycasting was actually working:

[screenshot]

@alice-i-cecile alice-i-cecile added S-Ready-For-Final-Review This PR has been approved by the community. It's ready for a maintainer to consider merging it and removed S-Needs-Review Needs reviewer attention (from anyone!) to move forward labels Jan 6, 2025
@tbillington
Contributor Author

The changes to the intersection code still give me pause. The only thing I'd like to see before merging is testing this in a real scene, and the closest we have is the caldera hotel. I'd say that as long as we aren't regressing in this case, I'm happy to approve and merge.

  • In all likelihood, performance will be unchanged, but I'd really like to see a representative case tested before merging. This is a hot path and small tweaks can have surprising impacts to performance, so I'm being pretty cautious.
  • I'm happy to help with this, I can also try doing this myself. I think the easiest way to test this will be using the mesh picking backend, loading in the caldera hotel gltf, and running tracy, while mousing over the same spot in the scene.

Yea, we could possibly create some kind of picking stress test, but the trouble is having realistic test data, so it's cool you could test with an actual scene.

@alice-i-cecile
Member

@tbillington @BD103 I'd like to merge this; ping me when merge conflicts are resolved please :)

@tbillington
Contributor Author

tbillington commented Jan 10, 2025

Not finished yet, but while converting the overlapping_planes benchmarks the test was failing. Even though I initially thought about removing it as aevyrie suggested, I wanted to make sure the test at least passed so there was less chance of something funky from my refactor.

I tried for about an hour to figure out why; it seems moving the y position of the plane's vertices would cause the ray to never intersect. I can't work out if something is being rotated somewhere or something else is going on, so I'm a little hesitant until I can figure out exactly what is wrong: whether it's just the silly mesh generation code I have, or something actually related to my picking refactor.

@tbillington
Contributor Author

tbillington commented Jan 10, 2025

@BD103 I'm pretty sure I've covered your requests, though I stopped short of fully utilising your Benchmarks struct for abstracting even more, since some of my benches added more variation in how things could work. I feel like I got most of the value from the helper macro_rules I wrote and from passing things into the test functions, but I'm open to suggestions!

  • The docs and refactor for `p_to_x_norm()`
  • The logarithmic scaling in the graphs
  • Correctly `black_box()`-ing all inputs
  • Testing that it does or does not intersect when `cargo test --benches` is run
  • The `bench!` macro naming

@BD103 BD103 added S-Waiting-on-Author The author needs to make changes or address concerns before this can be merged and removed S-Ready-For-Final-Review This PR has been approved by the community. It's ready for a maintainer to consider merging it labels Jan 10, 2025
@BD103
Member

BD103 commented Jan 10, 2025

Awesome! I'll do a thorough review of your changes later today, but I did notice some commented-out code. If Benchmarks doesn't fit your needs, feel free to discard it! I used it as a way to avoid duplicating code, but that was only because every benchmark tested the same thing with different values.

@BD103 BD103 self-requested a review January 10, 2025 13:13
@tbillington
Contributor Author

Yea, it was more of a work-in-progress commit to get your eyes on it. But yes, dealing with the commented-out stuff is next.

@BD103
Member

BD103 commented Jan 14, 2025

@tbillington In order to not hold up this PR, could you split the benchmark changes into a separate pull request? That way the actual optimizations can be merged while we work on the benchmarks separately.

@alice-i-cecile alice-i-cecile added this to the 0.16 milestone Jan 14, 2025
@alice-i-cecile
Member

@tbillington gentle ping on this. I'd like to get this into 0.16, and I think it's largely ready.

@alice-i-cecile alice-i-cecile removed this from the 0.16 milestone Mar 9, 2025
@alice-i-cecile alice-i-cecile added S-Adopt-Me The original PR author has no intent to complete this work. Pick me up! and removed S-Waiting-on-Author The author needs to make changes or address concerns before this can be merged labels Mar 9, 2025
@alice-i-cecile
Member

Cutting from the milestone and putting this up for adoption for now: it's been a couple of weeks without a response but I really do think that this work is valuable.

@alice-i-cecile
Member

Closing as mostly adopted. The benchmark changes are still desired in the future!

github-merge-queue bot pushed a commit that referenced this pull request Mar 10, 2025
_Note from BD103: this PR was adopted from #16148. The majority of this
PR's description is copied from the original._

# Objective

Adds tests to cover various mesh picking cases and removes sources of
panics.

It should prevent users being able to trigger panics in `bevy_picking`
code via bad mesh data such as #15891, and is a follow up to my comments
in #15800 (review).

This is motivated by #15979

## Testing

Adds 8 new tests to cover `ray_mesh_intersection` code.

## Changes from original PR

I reverted the changes to the benchmarks, since that was the largest
factor blocking it merging. I'll open a follow-up issue so that those
benchmark changes can be implemented.

---------

Co-authored-by: Trent <[email protected]>