Skip to content

fix(workflow): serialize dynamic-node emit across concurrent children#1040

Open
wolo-lab wants to merge 2 commits into
v2from
wolo/v2-h5-concurrent-emit
Open

fix(workflow): serialize dynamic-node emit across concurrent children#1040
wolo-lab wants to merge 2 commits into
v2from
wolo/v2-h5-concurrent-emit

Conversation

@wolo-lab

@wolo-lab wolo-lab commented Jun 15, 2026

Copy link
Copy Markdown
Contributor

Problem

A dynamic node's body (DynamicFn) can launch several children at once and run them on separate goroutines.
Every child sends its events "up" to the parent through a single shared callback (emit, which wraps the parent's one yield function). That callback had no synchronization, so when two children fire at the same time they call the same yield concurrently — and two things break:

  1. Panic. Go's range-over-func (the streaming mechanism behind these events) does not allow the same iterator to be entered from two goroutines at once. It detects the overlap and panics (range function continued iteration after loop body panic).
  2. Data race. The parent's event pump (runNode) records the node's outcome on that same path; two goroutines writing it concurrently is a race (caught by -race).

Trigger: any DynamicFn that fans children out across goroutines (e.g. errgroup / WaitGroup), each emitting at least one event.

Solution

Serialize emission with a single per-activation mutex inside makeEmit. Every yield — whether from the DynamicFn's own emit or from a child via RunNode or the sub-scheduler — now goes through the same lock, so only one runs at a time. The lock is held only around the single yield call, so children still run concurrently; only the hand-off upstream is serialized.

@wolo-lab wolo-lab force-pushed the wolo/v2-h5-concurrent-emit branch from f55ea2a to c55da61 Compare June 15, 2026 20:59
@wolo-lab wolo-lab marked this pull request as ready for review June 15, 2026 21:07
A DynamicFn may run children concurrently (the documented
WithUseSubBranch pattern), and every child forwards its events up through
one shared emit callback (makeEmit, wrapping the parent's single yield).
With no synchronization, concurrent children call the same yield at once,
which panics the range-over-func iterator and races the parent runNode's
completion accumulator.

Guard emit with a per-activation mutex so all yields — from the
DynamicFn's own emit and from RunNode via the sub-scheduler — are
serialized. Add a -race regression test that fans children out across
goroutines; it panics/races on the unpatched code.
@wolo-lab wolo-lab force-pushed the wolo/v2-h5-concurrent-emit branch from c55da61 to 3ec8581 Compare June 15, 2026 21:07
@wolo-lab wolo-lab requested a review from dpasiukevich June 15, 2026 21:25
Comment thread workflow/dynamic_node.go Outdated
// passed to the DynamicFn and the same emit driven by RunNode via
// the sub-scheduler. Concurrent children must not yield at once.
var emitMu sync.Mutex
emit := makeEmit(yield, ctx, &emitMu)

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: would it make sense to create mutex within makeEmit function? this will make a simpler func prototype.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's an excelant idea!

Address review nit: the per-activation mutex is owned solely by the
emit closure, so create it inside makeEmit instead of passing it in.
Simplifies the makeEmit prototype and removes the caller-side mutex
plumbing. No behavior change.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants