fix(workflow): serialize dynamic-node emit across concurrent children #1040
Open
wolo-lab wants to merge 2 commits into
f55ea2a to c55da61
A DynamicFn may run children concurrently (the documented WithUseSubBranch pattern), and every child forwards its events up through one shared emit callback (makeEmit, wrapping the parent's single yield). With no synchronization, concurrent children call the same yield at once, which panics the range-over-func iterator and races the parent runNode's completion accumulator. Guard emit with a per-activation mutex so all yields — from the DynamicFn's own emit and from RunNode via the sub-scheduler — are serialized. Add a -race regression test that fans children out across goroutines; it panics/races on the unpatched code.
c55da61 to 3ec8581
dpasiukevich
approved these changes
Jun 16, 2026
// passed to the DynamicFn and the same emit driven by RunNode via
// the sub-scheduler. Concurrent children must not yield at once.
var emitMu sync.Mutex
emit := makeEmit(yield, ctx, &emitMu)
Collaborator
nit: would it make sense to create the mutex inside the makeEmit function? That would make for a simpler function prototype.
Contributor
Author
That's an excellent idea!
Address review nit: the per-activation mutex is owned solely by the emit closure, so create it inside makeEmit instead of passing it in. Simplifies the makeEmit prototype and removes the caller-side mutex plumbing. No behavior change.
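A minimal sketch of what a makeEmit that owns its own mutex could look like. This is not the PR's actual code: the Event type, the return type of makeEmit, the stopped flag, and the ctx.Err() check are all assumptions made for illustration; only the names makeEmit, yield, and ctx and the (yield, ctx) argument order come from the diff above.

```go
package main

import (
	"context"
	"fmt"
	"sync"
)

// Event is a hypothetical stand-in for the workflow's event type.
type Event struct {
	Child, Seq int
}

// makeEmit wraps yield in a closure that owns its own mutex, so callers
// no longer create or pass one. Each call to makeEmit gets a fresh mutex,
// which makes it per-activation. The signature is assumed, not the PR's.
func makeEmit(yield func(Event) bool, ctx context.Context) func(Event) bool {
	var mu sync.Mutex
	stopped := false // assumption: remember that the consumer asked to stop
	return func(ev Event) bool {
		mu.Lock()
		defer mu.Unlock()
		if stopped || ctx.Err() != nil {
			return false
		}
		if !yield(ev) {
			stopped = true
			return false
		}
		return true
	}
}

// fanOut runs children on separate goroutines, each emitting perChild
// events, and returns how many events reached yield.
func fanOut(children, perChild int) int {
	// received is deliberately unsynchronized: it is only safe because
	// emit guarantees that yield never runs concurrently with itself.
	received := 0
	yield := func(Event) bool {
		received++
		return true
	}
	emit := makeEmit(yield, context.Background())

	var wg sync.WaitGroup
	for c := 0; c < children; c++ {
		wg.Add(1)
		go func(child int) {
			defer wg.Done()
			for s := 0; s < perChild; s++ {
				emit(Event{Child: child, Seq: s})
			}
		}(c)
	}
	wg.Wait()
	return received
}

func main() {
	fmt.Println(fanOut(8, 100)) // prints 800
}
```

Because the mutex lives inside the closure, the caller side shrinks to a single makeEmit(yield, ctx) call, which is the simplification the review nit asked for.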
Problem
A dynamic node's body (DynamicFn) can launch several children at once and run them on separate goroutines. Every child sends its events "up" to the parent through a single shared callback (emit, which wraps the parent's one yield function). That callback had no synchronization, so when two children fire at the same time they call the same yield concurrently, and two things break:
- The range-over-func iterator panics (range function continued iteration after loop body panic).
- The parent (runNode) records the node's outcome on that same path; two goroutines writing it concurrently is a race (caught by -race).
Trigger: any DynamicFn that fans children out across goroutines (e.g. errgroup/WaitGroup), each emitting at least one event.
Solution
Serialize emission with a single per-activation mutex inside makeEmit. Every yield, whether from the DynamicFn's own emit or from a child via RunNode or the sub-scheduler, now goes through the same lock, so only one runs at a time. The lock is held only around the single yield call, so children still run concurrently; only the hand-off upstream is serialized.
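The property described above (children run concurrently, but at most one yield is in flight) can be checked with a small instrumented sketch. The serialize helper below is a hypothetical stand-in for makeEmit, events are plain ints, and yield is an ordinary function rather than a range-over-func loop body, so this shows the locking pattern only, not the project's code.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// serialize is a hypothetical stand-in for makeEmit: it returns an emit
// function that holds a mutex only for the duration of one yield call.
func serialize(yield func(int) bool) func(int) bool {
	var mu sync.Mutex
	return func(ev int) bool {
		mu.Lock()
		defer mu.Unlock()
		return yield(ev)
	}
}

// run fans children out across goroutines and reports how many events
// were delivered and the largest number of yield calls seen in flight.
func run(children, perChild int) (delivered int, peak int32) {
	var inFlight int32
	yield := func(int) bool {
		// delivered and peak are plain variables; writing them here is
		// safe only because emit never lets two yields overlap.
		if n := atomic.AddInt32(&inFlight, 1); n > peak {
			peak = n
		}
		delivered++
		atomic.AddInt32(&inFlight, -1)
		return true
	}
	emit := serialize(yield)

	var wg sync.WaitGroup
	for c := 0; c < children; c++ {
		wg.Add(1)
		go func(child int) {
			defer wg.Done()
			for s := 0; s < perChild; s++ {
				ev := child*perChild + s // the child's own work happens outside the lock
				emit(ev)
			}
		}(c)
	}
	wg.Wait()
	return delivered, peak
}

func main() {
	delivered, peak := run(4, 100)
	fmt.Printf("delivered=%d peak=%d\n", delivered, peak) // prints delivered=400 peak=1
}
```

Running the same sketch under go run -race with the mutex removed from serialize would be expected to report a data race on delivered and peak, which mirrors the failure the regression test targets.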