-
Notifications
You must be signed in to change notification settings - Fork 922
Fix frame registration/deregistration for libgcc and libunwind #5886
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Co-authored-by: zebreus <[email protected]>
|
@copilot continue and add a test |
Co-authored-by: zebreus <[email protected]>
Added two tests for thread safety in commit 0ad7a3b:
Both tests pass successfully. |
|
@copilot That is the wrong fix. The issue is about unregister being called when exiting. look into what gcc itself does during exiting and how that interacts with our code |
The previous fix using a mutex was incorrect. The real issue is that we should not call __deregister_frame in Drop at all, as it races with libgcc's own destructor (release_registered_frames) which runs during program shutdown. When libgcc's destructor runs, it sets in_shutdown=true and cleans up all registered frames. If our Drop runs after this and tries to deregister frames, it triggers gcc_assert(in_shutdown || ob) which aborts when the frame is not found. The solution is to rely on libgcc's automatic cleanup and not manually deregister frames. This is safe because: - During normal operation, frames remain registered (no leak as they're in libgcc's global data structures) - During shutdown, libgcc's destructor cleans everything up - macOS ARM64 still needs manual cleanup as it uses a different mechanism Co-authored-by: zebreus <[email protected]>
You're absolutely right. I investigated the GCC libgcc implementation and found the real issue. The problem is that GCC has a destructor function If our Drop runs after GCC's destructor, we try to deregister frames that no longer exist, triggering The correct fix (commit 7af3d1c): Remove The mutex approach was wrong - it was trying to serialize access but didn't address the fundamental ordering problem between our Drop and GCC's destructor. |
|
@copilot If we are using multiple unwind registries in the code, we need them to deregister frames when we drop them. That's the case with multiple wasm modules, but I am not certain. Could you make sure we unregister the frames when dropping a WASM module? |
The previous approach of not deregistering frames was incorrect for scenarios where multiple WASM modules are loaded and unloaded during runtime (not just at shutdown). Key changes: 1. For libgcc: Register the entire .eh_frame section ONCE (not individual CIEs). This matches LLVM's RTDyldMemoryManager implementation. 2. For libunwind: Continue registering individual FDEs (correct behavior) 3. Properly deregister frames in Drop to support module unloading 4. Add null pointer check to avoid issues if GCC's destructor runs first during shutdown This fix ensures: - Multiple modules can be loaded/unloaded without frame registry leaks - Deregistration works correctly for both libgcc and libunwind - Race condition with GCC's shutdown destructor is mitigated by the null pointer check Co-authored-by: zebreus <[email protected]>
You're right - we need deregistration for module unloading. The issue was that we were registering frames incorrectly for libgcc. Fixed in commit 7bce412:
This ensures modules properly clean up their frames when dropped while avoiding the crash during shutdown. |
|
Closing as we're working on that in #5893. |
Description
This PR addresses a segfault that occurs occasionally when running wasmer with the LLVM compiler. The issue involves incorrect frame registration for libgcc and the need to properly support frame deregistration for multiple WASM module scenarios.
Root Cause
There were two issues:
Incorrect registration for libgcc: The original code was registering individual CIEs for libgcc systems, but according to LLVM's RTDyldMemoryManager.cpp and GCC's implementation, libgcc's
__register_frameexpects a pointer to the START of the entire.eh_framesection (which it scans until it finds a terminator), not individual records.Need for deregistration: When multiple WASM modules are loaded and unloaded during runtime (not just at shutdown), frames must be properly deregistered to avoid registry leaks. Simply relying on GCC's shutdown destructor is insufficient for this use case.
Shutdown race condition: There is a potential race condition where our Drop implementation could run after GCC's
release_registered_framesdestructor (marked with__attribute__((destructor))) has already cleaned up during program shutdown, triggeringgcc_assert(in_shutdown || ob).Changes Made
.eh_framesection pointer ONCE to__register_frame, matching LLVM's implementationtest_unwind_registry_drop_does_not_panic: Verifies basic Drop behaviortest_multiple_registries_can_be_dropped: Verifies sequential drops work correctly (simulates multiple module loading/unloading)test_concurrent_registry_drops: Verifies concurrent drops don't cause issuesWhy This Fix Is Correct
if *registration != 0) prevents crashes if GCC's destructor runs first during shutdown.eh_framesection pointer for libgcc ensures proper unwinding behaviorTesting
--features=llvm--features=compilerOriginal prompt
This section details on the original issue you should resolve
<issue_title>cargo r --features=llvm -p wasmer-cli run --disable-cache cowsay aa - crashes occasionally</issue_title>
<issue_description>If run the command from the current
mainbranch, sometimes I hit the following segfault: