Skip to content

Specify policy on panics in the Bevy repo. #15979

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
tbillington opened this issue Oct 18, 2024 · 8 comments
Open

Specify policy on panics in the Bevy repo. #15979

tbillington opened this issue Oct 18, 2024 · 8 comments
Labels
A-Cross-Cutting Impacts the entire engine S-Needs-Design This issue requires design work to think about how it would best be accomplished

Comments

@tbillington
Copy link
Contributor

What problem does this solve or what need does it fill?

Bevy currently does not have a policy or documentation on the usage of panics in Rust code.

Because unhandled errors (panics) in Bevy result in the termination of the application, they are very harmful to the user experience. A panic in Rust signals the application has entered into an unrecoverable error state, and application termination is the only viable course of action. This is reinforced in the Rust book.

The panic! macro signals that your program is in a state it can’t handle and lets you tell the process to stop instead of trying to proceed with invalid or incorrect values.

Panicing is especially sensitive in a library like Bevy since it takes away choice from the user about how to handle a particular situation. To Bevys credit, it often does give the user a choice, such as Query::get_single which returns a Result, as opposed to Query::single, which is good to see.

Unhandled error fallout

Since Bevy is based upon writing game logic in Rust at the same "level" as core engine code, and there is no separate runtime like engines such as Unity or Godot, unhandled errors due to either user error or engine library error are much more disruptive. With runtime based engine, an unhandled error in user code or much of the engine code will result in an exception being raised, and a error being logged, but otherwise execution of the game will continue relatively uninterrupted.

That resilience and gracefully handling of errors is incredibly useful to game developers. Not only are games often produced under time, budget, and team constraints, but they may also be worked on by large teams with people of varied skill sets that may not know all of the invariants that are supposed to be kept in game logic. Rapid prototyping, iteration, and changes late in development also prevent developers from producing perfect code that handles all edge cases. At their core, games are about delivering an experience for players playing on their own devices. As such, if not handled well, errors can be especially disruptive to the player experience, and ultimately to the games reception and reputation.

A crash in a Bevy game isn't just a stacktrace on the developers machine. It results in a negative experience for players, and possibly worse reviews and refunds, materially impacting the developer.

This being the case, the use of panics in Bevy should be avoided wherever possible, heavily scrutinised, and only used as a last result.

What solution would you like?

Codify a policy on the use of panics with the Bevy engine, preferable upheld by CI lints/rules.

These rules could be tighter/looser per crate, but that's just minor details at this point.

Additional context

Motivating experience

My motivation for creating this issue was a couple of recent PRs that had me concerned about the difference in expectations about the use of panics between my assumptions and current Bevy maintainer expectations.

Before continuing I want to stress that I see this as 100% an expectations and process thing, and not any kind of personal indictment! I greatly value the contributions of everyone involved and hope this just serves to clarify expectations between people contributing to Bevy, and (at least 1) user(s) of Bevy going forward :)

As part of implementing picking in Bevy, mesh picking was introduced in #15800. I saw in discord that more reviews were desired, and I wanted to help out by reviewing. As part of my review I encountered multiple sources of panics in the new code, which I mentioned in the review. As someone who had only done a handful of contributions/reviews to Bevy previously, I didn't want to come across too strong as I'm still learning what the Bevy code & review expectations are.

While one source of panics was removed as a result, a few remained. What surprised me was that only my casual review even mentioned the panics. My assumption was that introducing panics into the engine in such user interfacing code would be more controversial.

Where I became concerned that perhaps my view of the use of panics and that others was bigger than I thought was during a follow-up issue not 12 hours later, with code from one of Bevys own examples triggering a panic from the PR in a code path I hadn't even noticed or commented on. It reinforced that this new API & impl in Bevy was a very easy source of panics.

To my surprise, the fix was just to correct the example, not any of the offending engine code. Of course, the incorrect example should be fixed, and was. However the inaction on the panicing code even after seeing it being hit so soon after merging was what motivated me to create this issue to better understand and clarify the use of panics in the codebase.

If panics were so readily accepted in just the one PR I happened to review randomly, does that mean many more PRs introduce panics at this level I haven't seen? Of course I only have a small sample size, but that thought worried me quite a bit.

I want Bevy to be a rock solid foundation for me and others to build incredible experiences on, and that includes being robust against the realities of the game development process I mentioned earlier. I want to have confidence Bevy will handle anything I throw at it, and be the last source of worries for me as a game developer.

To that end, my personal expectation as a user of Bevy is that short of something catastrophic such as the user removing their graphics card mid-game, or me messing with internals via reflection, the engine should never take down the process if at all possible.

Again, I have a lot of respect for everyone contributing to Bevy. I just want this issue clarified to we can all be unified going forward without needing to repeatedly discuss details based on personal opinions in every issue/PR 💖

@tbillington tbillington added C-Feature A new feature, making something new possible S-Needs-Triage This issue needs to be labelled labels Oct 18, 2024
@bushrat011899
Copy link
Contributor

I think this is an important piece of documentation and policy. From a no_std perspective, this is even more important, since you don't have access to catch_unwind from the std library to minimize the impact of a panicking system (for example).

I think at a minimum we should require Panics sections in the documentation of all public APIs (which is already mostly if not totally done!). I'm not sure if that's enforceable via a built-in Clippy lint, or even with the Bevy CLI. There is a no_panic crate, but it's very hard to work with in a complex codebase with many dependencies.

Perhaps we could deny the usage of public panicking functions within Bevy's crates? Or (more contentious) add a panic feature to access them?

@BD103
Copy link
Member

BD103 commented Oct 18, 2024

The Bevy Linter has a lint to deny panicking methods in Query and World. We can enforce whatever policy is decided on through this.

@BD103 BD103 added S-Needs-Design This issue requires design work to think about how it would best be accomplished A-Cross-Cutting Impacts the entire engine and removed C-Feature A new feature, making something new possible S-Needs-Triage This issue needs to be labelled labels Oct 18, 2024
@Atlas16A
Copy link

What ever gets decided here, I would also like to add this to bevy_editor_prototypes for the sake of policy parity with bevy main.

@ghost
Copy link

ghost commented Oct 29, 2024

I'd much prefer a panic feature option, I'm avoiding Single and likely going to patch bevy in order to regain ability to have panics for missing resources or other such related logic bugs(perhaps it's still possible and I glossed over some details). I was frankly bothered when I encountered a log message saying a system hasn't run due to a missing resource. At least some option to catch these errors on the user side is required so they can make the choice, again I may have glossed over some details.

@tbillington
Copy link
Contributor Author

tbillington commented Nov 14, 2024

I'd say that's more of bevy_ecs's current default API rather than how bevy implements things internally (which was my original focus), but it is quite fundamental to how bevy works.

You raise a legitimate point that discovering missing system requirements at development time should be a well supported operation, and having a toggle that could be enabled during development to complain more loudly or even panic/debug_assert would help surface them.

I'm absolutely in favour of options like that and giving developers the choice and tooling that they need.

My greater point is that a shipped bevy game should be resilient to failures like these without forcing the behaviour of crashing. I think this will become especially important as development progresses, and you have a large codebase with many dependencies, ensuring every invariant of every dependency is perfectly upheld is not feasible ahead of time. My opinion is defaulting to panicing is unacceptable for a release/distribution build, but ultimately the game developer/user of bevy should have that choice.

@mockersf
Copy link
Member

The current policy is:

  • panic if an error can't be returned
  • have shorter panicking alternatives to functions that are often used and return errors

Unrelated to this issue that is about documenting the policy, but there are places where we could return errors instead of panics, those should be fixed, and there are plans to work on async panics (failed commands mostly)

github-merge-queue bot pushed a commit that referenced this issue Mar 10, 2025
_Note from BD103: this PR was adopted from #16148. The majority of this
PR's description is copied from the original._

# Objective

Adds tests to cover various mesh picking cases and removes sources of
panics.

It should prevent users being able to trigger panics in `bevy_picking`
code via bad mesh data such as #15891, and is a follow up to my comments
in [#15800
(review)](#15800 (review)).

This is motivated by #15979

## Testing

Adds 8 new tests to cover `ray_mesh_intersection` code.

## Changes from original PR

I reverted the changes to the benchmarks, since that was the largest
factor blocking it merging. I'll open a follow-up issue so that those
benchmark changes can be implemented.

---------

Co-authored-by: Trent <[email protected]>
@tbillington
Copy link
Contributor Author

tbillington commented Apr 21, 2025

Just saw this in the Godot FAQs.

Why does Godot not use exceptions?

We believe games should not crash, no matter what. If an unexpected situation happens, Godot will print an error (which can be traced even to script), but then it will try to recover as gracefully as possible and keep going.

I'm very much of the same opinion, as are the people I've asked who do gamedev professionally. I'm really struggling to understand the rationale for the panic-happy nature of Bevy.

Bevy has been the most crash-prone engine I've used, which is kinda nuts considering it's a modern, "from scratch" game engine in Rust. It makes it hard to take the rest of the engine seriously sometimes.

I'm empathetic that in Bevy we're writing user code with such a thin or non-existent boundary with the "engine", and the nature of Rust error handling makes panics the "easy" option, but I really think Bevy can and needs to do better here.

IMO there's two main aspects, APIs that make panicing the default/easy/obvious solution, and implementations that panic unnecessarily eg #15891 or #18252. I hope we can improve on both fronts.

@BD103
Copy link
Member

BD103 commented Apr 21, 2025

IMO there's two main aspects, APIs that make panicing the default/easy/obvious solution, and implementations that panic unnecessarily eg #15891 or #18252. I hope we can improve on both fronts.

I believe this has been significantly improved with unified error handling. Panicking is no longer the default for many popular types like Query (#18082). Looking over the API as it currently stands, are there any other places you believe should be given the same treatment? (Where are spots where we currently panic but should return a Result?)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-Cross-Cutting Impacts the entire engine S-Needs-Design This issue requires design work to think about how it would best be accomplished
Projects
None yet
Development

No branches or pull requests

5 participants