fix(observability): fix panic in allocation tracing dealloc path#25136
fix(observability): fix panic in allocation tracing dealloc path#25136
Conversation
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: c7205b2360
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
src/internal_telemetry/allocations/allocator/tracing_allocator.rs
Outdated
Show resolved
Hide resolved
565e816 to
37e58e6
Compare
37e58e6 to
5360442
Compare
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 5360442631
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
… pre-tracking memory When `--allocation-tracing` is enabled at runtime, the custom allocator wraps every allocation with an extra byte to store the allocation group ID. Previously, allocations made before tracking was enabled used the original (unwrapped) layout. When those were later freed, `dealloc` read an out-of-bounds byte as the group ID, hitting `NonZeroU8::new_unchecked(0)` -- undefined behavior that recent Rust toolchains (>= ~1.78) turn into an abort in debug builds. Additionally, reentrant allocations (wrapped layout but tracing closure skipped) left the group ID header uninitialized, causing misattributed deallocations and skewed per-group memory accounting. Fix: always allocate with the wrapped layout, regardless of whether tracking is currently enabled. The group ID header byte is set to: - UNTRACKED (0): tracking was off at allocation time - UNTRACED (u8::MAX): tracking was on but the tracing closure was skipped due to reentrancy - A real group ID (1..254): normal traced allocation On deallocation, all paths free with the wrapped layout (which is always correct now). UNTRACKED and UNTRACED skip `trace_deallocation` to keep per-group accounting balanced. This eliminates: - The original panic/UB from `NonZero::new_unchecked(0)` - Layout mismatches for pre-tracking allocations (including realloc) - Accounting skew from uninitialized headers on reentrant allocations This bug has been latent since #15221 (Nov 2022) which introduced the runtime toggle. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
5360442 to
d3e4fe4
Compare
…ing-dealloc-panic
Harden test coverage based on code review feedback: - Assert exact ROOT group ID and trace counters in tracked alloc test - Add end-to-end reentrant allocation test that exercises the real reentrancy path via thread-local borrow - Remove redundant manual-sentinel test now covered by reentrant test - Rename sentinel collision test to reflect what it actually checks Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
@codex review |
|
Codex Review: Didn't find any major issues. Breezy! ℹ️ About Codex in GitHubYour team has set up Codex to review pull requests in this repo. Reviews are triggered when you
If Codex has suggestions, it will comment; otherwise it will react with 👍. Codex can also answer questions or update the PR. Try commenting "@codex address that feedback". |
Summary
Fixes a panic (abort) in the
vector-alloc-processorthread when--allocation-tracingis enabled:Root cause: The custom allocator wraps every allocation with an extra byte to store the allocation group ID. Previously, allocations made before tracking was enabled used the original (unwrapped) layout. When those were later freed,
deallocread an out-of-bounds byte as the group ID, hittingNonZeroU8::new_unchecked(0)-- undefined behavior that recent Rust toolchains (>= ~1.78) turn into an abort in debug builds.Additionally, reentrant allocations (wrapped layout but tracing closure skipped due to
RefCellborrow) left the group ID header uninitialized, causing misattributed deallocations and skewed per-group memory accounting.Fix: Always allocate with the wrapped layout, regardless of whether tracking is currently enabled. The group ID header byte is set to:
UNTRACKED(0): tracking was off at allocation timeUNTRACED(u8::MAX): tracking was on but the tracing closure was skipped due to reentrancyOn deallocation, all paths free with the wrapped layout (always correct now).
UNTRACKEDandUNTRACEDskiptrace_deallocationto keep per-group accounting balanced.This eliminates:
NonZero::new_unchecked(0)realloc)History: This bug has been latent since #15221 (Nov 2022) which introduced the runtime toggle. It was never caught because release builds don't enable UB precondition checks, and debug builds on older Rust silently allowed the invalid
NonZeroU8(0).Vector configuration
Any configuration with
--allocation-tracingflag orALLOCATION_TRACING=trueenv var.How did you test this PR?
cargo run -- -c <config.yml> --allocation-tracingno longer panicsmake check-clippyandmake check-fmtpassChange Type
Is this a breaking change?
Does this PR include user facing changes?
no-changeloglabel to this PR.References
🤖 Generated with Claude Code