Skip to content

feat(container-runtime): add groupedOpCount to grouped batch metadata#27213

Merged
anthony-murphy merged 15 commits intomicrosoft:mainfrom
anthony-murphy:users/anthonm/grouped-batch-op-count
May 6, 2026
Merged

feat(container-runtime): add groupedOpCount to grouped batch metadata#27213
anthony-murphy merged 15 commits intomicrosoft:mainfrom
anthony-murphy:users/anthonm/grouped-batch-op-count

Conversation

@anthony-murphy
Copy link
Copy Markdown
Contributor

@anthony-murphy anthony-murphy commented Apr 30, 2026

Description

Adds a groupedOpCount field to the metadata of grouped batch envelope messages, exposing the number of inner ops in the batch on the wire. This lets wire-level consumers (Word telemetry in particular) record batch sizes without having to parse the grouped batch contents.

Fixes AB#71749.

The field is set on:

  • The single envelope produced by grouping a batch (OpGroupingManager.groupBatch).
  • The empty-grouped-batch placeholder used when a resubmitted batch becomes empty (OpGroupingManager.createEmptyGroupedBatch, value 0).
  • The final chunk of a chunked grouped batch (OpSplitter.splitSingletonBatchMessage), so the count survives chunking. Intermediate chunks remain unchanged, mirroring how the existing batch flag is preserved only on the final chunk.

Compression preserves message metadata as-is, so compressed (non-chunked) grouped batches carry the field through unchanged with no additional code.

IBatchMetadata is internal to container-runtime and not in any API report, so this is not a public API change.

Reviewer Guidance

The review process is outlined on this wiki page.

  • The wire-format addition is a single optional field on the outer envelope's metadata. It is only read by external telemetry consumers — the runtime itself does not act on it, so behavior is unchanged.
  • The OpGroupingManager unit tests assert the new field on both the grouped envelope and the empty-grouped-batch placeholder. opSplitter.spec.ts has a new Propagates groupedOpCount onto the last chunk only test asserting intermediate chunks carry no metadata, the last chunk's envelope carries { batch, groupedOpCount }, and reassembly via processChunk restores the field on the reconstructed message.

Expose the inner op count on the metadata of grouped batch envelope
messages so wire-level consumers (e.g. Word telemetry) can record
batch sizes without parsing the grouped batch contents.

Set on:
- the envelope produced by groupBatch
- the empty-grouped-batch placeholder (value 0)
- the final chunk of a chunked grouped batch, so the count survives
  chunking. Intermediate chunks remain unchanged.

Compression preserves metadata, so compressed grouped batches carry
the field through unchanged.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented Apr 30, 2026

Hi! Thank you for opening this PR. Want me to review it?

Based on the diff (39 lines, 5 files), I've queued these reviewers:

  • Correctness — logic errors, race conditions, lifecycle issues
  • Security — vulnerabilities, secret exposure, injection
  • API Compatibility — breaking changes, release tags, type design
  • Performance — algorithmic regressions, memory leaks
  • Testing — coverage gaps, hollow tests

Toggle the reviewer checkboxes above to adjust, then tick the box below to start:

  • Start review

Comment thread packages/runtime/container-runtime/src/opLifecycle/opSplitter.ts Outdated
anthony-murphy and others added 5 commits April 30, 2026 15:03
- Add splitter unit test asserting groupedOpCount lands only on the
  last chunk and survives reassembly via originalMetadata.
- Inline the metadata read in splitSingletonBatchMessage to match how
  the batch flag is read; the typeof guard was unnecessary since the
  metadata field is already typed as Record<string, unknown>.
- Extend IBatchMetadata.groupedOpCount JSDoc to enumerate observable
  values (absent / 0 / N).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Escape `>` in `(N > 0)` to satisfy tsdoc-escape-greater-than, and
unwrap the continuation line to satisfy jsdoc/check-indentation.
Stashed-state back-compat: optional/additive field, safely ignored by
old runtimes; new runtimes are unaffected by old stashes.

Advisory-only: consumed by off-runtime telemetry; the runtime does not
validate inbound `groupedOpCount` against the actual inner op count.
Comment thread packages/runtime/container-runtime/src/metadata.ts
Comment thread packages/runtime/container-runtime/src/opLifecycle/opSplitter.ts
anthony-murphy and others added 3 commits May 1, 2026 12:48
Add a one-line `groupedOpCount` round-trip assertion in opCompressor.spec.ts
mirroring the existing `flag` metadata pattern, so a future OpCompressor
refactor that drops generic metadata is caught by the test suite.

Document on IBatchMetadata.groupedOpCount that the field is advisory-only:
the runtime does not validate inbound values against actual inner op count;
it is consumed exclusively by off-runtime telemetry.
Replace the splitter comment's "mirroring batch flag" hand-wave with the
real reason for last-chunk-only stamping: chunks carry payload parts, not
ops, so per-chunk stamping would let observers double-count.

Tighten the IBatchMetadata JSDoc to capture the stronger invariant — the
field is always recomputed at outbound time (groupBatch reads
batch.messages.length, createEmptyGroupedBatch always writes 0, the
chunked path reads from a freshly-grouped envelope), never propagated
from stashed pending state. Resubmitted batches with squashed/dropped/
added ops produce a fresh count matching the actual outbound size.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@anthony-murphy anthony-murphy marked this pull request as ready for review May 4, 2026 20:37
Copilot AI review requested due to automatic review settings May 4, 2026 20:37
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

Adds a new optional groupedOpCount wire-metadata field for grouped batch envelopes so downstream telemetry can observe grouped batch sizes without parsing grouped contents.

Changes:

  • Stamp groupedOpCount on grouped batch envelopes and empty grouped-batch placeholders in OpGroupingManager.
  • Preserve groupedOpCount on the final chunk when a grouped batch is chunked in OpSplitter.
  • Add/update tests covering grouping, chunking/reassembly, and compression metadata preservation.

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 1 comment.

Show a summary per file
File Description
packages/runtime/container-runtime/src/test/opLifecycle/opSplitter.spec.ts Adds coverage for preserving groupedOpCount on the last chunk and after reassembly.
packages/runtime/container-runtime/src/test/opLifecycle/opCompressor.spec.ts Verifies compression keeps groupedOpCount in message metadata.
packages/runtime/container-runtime/src/test/opLifecycle/OpGroupingManager.spec.ts Updates grouping tests to assert groupedOpCount on grouped and empty grouped batches.
packages/runtime/container-runtime/src/opLifecycle/opSplitter.ts Copies groupedOpCount onto the final chunk’s envelope metadata.
packages/runtime/container-runtime/src/opLifecycle/opGroupingManager.ts Stamps groupedOpCount when creating grouped and empty grouped batch metadata.
packages/runtime/container-runtime/src/metadata.ts Documents and types the new groupedOpCount metadata field.

Comment thread packages/runtime/container-runtime/src/metadata.ts Outdated
Comment thread packages/runtime/container-runtime/src/metadata.ts
anthony-murphy and others added 3 commits May 4, 2026 15:38
Old runtimes that predate this field also produce grouped batch envelopes
with absent groupedOpCount, so absence cannot be read as definitively
"not a grouped batch envelope." Telemetry consumers should treat absence
as ambiguous until rollout is complete and fall back to parsing envelope
contents if a precise count is required.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Comment thread packages/runtime/container-runtime/src/opLifecycle/opSplitter.ts
anthony-murphy and others added 2 commits May 5, 2026 23:36
The chunked last-chunk envelope carries { batch, groupedOpCount } but not
batchId. With groupedOpCount now stamped, the prior "passively missing"
omission becomes a load-bearing selective-forwarding choice. batchId is
restored on the reassembled message via originalMetadata for runtime
dedup; no wire-level consumer reads it pre-reassembly.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@anthony-murphy
Copy link
Copy Markdown
Contributor Author

Deep Review

Reviewed commit 6577208 on 2026-05-06.

Readiness: 10/10 — READY

Ready for sign-off. Small, well-scoped additive change exposing groupedOpCount?: number on the runtime-internal IBatchMetadata, stamped at the grouped envelope, the empty-grouped-batch placeholder, and the last chunk of a chunked grouped batch, and preserved through compression. The last gating thread on opSplitter.ts:184 (chunked envelope forwards groupedOpCount but drops batchId) closed in 6577208cea with a 3-line comment at opSplitter.ts:180-182 pinning the omission as intentional — batchId is a runtime dedup field consumed only after processChunk restores originalMetadata, not by wire observers — and the existing Propagates groupedOpCount onto the last chunk only test deep-equals the last chunk's metadata to { batch: true, groupedOpCount }, which would catch a future accidental batchId addition. No author actions remain; outstanding items are area-owner sign-offs and one external-consumer confirmation listed below.

Context for Reviewers

For human reviewer
  • @markfields sign-off on (a) the empty-grouped-batch placeholder change at opGroupingManager.ts:103 (he authored fix(Offline): Applying stashed ops with empty batches #24779 fixing the same path's stashed-state regression and Offline: Add batchId to batch metadata on resubmit #21767/Send empty group batch on resubmit #21854's batchId design that the new opSplitter.ts:180-182 comment defers to) and (b) confirming the chunked-envelope { batch, groupedOpCount }-but-no-batchId shape is the right wire contract for service/observer consumers (resolved thread on opSplitter.ts:184, comments 3196431025/3196434785/3196435159).
  • @vladsud sign-off on adding groupedOpCount to envelope metadata as the first deliberately advisory, non-service-validated field on IBatchMetadata — a departure from the Allow batches to group ops together #14512-era "service-inspectable only" rule. Author has declined an in-source "first advisory field" comment (closed thread on metadata.ts:50, comment 3193249441), so this judgment lives entirely on the human reviewer.
  • @vladsud / @markfields design call on whether the singleton grouped fast path should also stamp groupedOpCount: 1 so that "absent" eventually becomes unambiguous post-rollout. Author's documented position is that the field is scoped to grouped envelopes by name and the JSDoc adequately documents the singleton-bypass case.
  • @andre4i optional ack on extending the last-chunk-only metadata convention from When enabled, chunk compressed batches larger than 1MB #13329 to a third metadata field (groupedOpCount) plus the batchId-yes-on-non-chunked / no-on-chunked-envelope asymmetry now documented at opSplitter.ts:180-182.
  • Confirm with the AB#71749 / Word telemetry consumer that observing groupedOpCount only on the last chunk envelope (not every chunk, and absent on singleton-bypass envelopes) is acceptable for their ingestion point.
  • Cannot be assessed by the pipeline — whether host-side telemetry consumers read from the wire envelope vs. the reassembled message at a point where last-chunk-only placement is observable; whether the structural "always recompute on outbound" invariant for the empty-grouped-batch placeholder continues to hold under future Outbox / staging-mode refactors.
Review history (7 prior reviews)
  • 8573fb6 2026-05-05 · 9/10 — chunked-envelope batchId-omission concern was the sole gate
  • 81b6231 2026-05-05 · 8/10 — advisory-only-field gate plus new chunked-envelope batchId-consistency thread
  • 07f401b 2026-05-05 · 8/10 — advisory-only-field precedent remained as sole gate
  • 6e2d057 2026-05-04 · 8/10 — JSDoc "Absent" wording tightened; advisory-only-field polish remained as sole gate
  • 17edd87 2026-05-04 · 8/10 — resubmit/recompute concern resolved via JSDoc invariant; advisory-only-field polish remained
  • ceefa6a 2026-05-04 · 8/10 — splitter-test gap resolved; two missing-test polish items remained, both mitigated by structural arguments
  • 371a96c 2026-05-04 · 8/10 — initial review; advisory-only field aligns with existing precedents, only low-confidence polish items remained

@anthony-murphy anthony-murphy merged commit 7f1371f into microsoft:main May 6, 2026
37 checks passed
@anthony-murphy anthony-murphy deleted the users/anthonm/grouped-batch-op-count branch May 6, 2026 21:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants