drmemtrace scheduler turns all-blocked runqueue into fatal STATUS_IDLE error #7318

Closed
derekbruening opened this issue Mar 4, 2025 · 0 comments · Fixed by #7323

Comments

@derekbruening
Contributor

If, during a rebalance, an output's runqueue contains only blocked inputs, the runqueue returns STATUS_IDLE. The scheduler propagates that status as a fatal error even though it should be innocuous. The failure shows up as this message:

[scheduler] Failed to rebalance with status 8
@derekbruening derekbruening self-assigned this Mar 4, 2025
derekbruening added a commit that referenced this issue Mar 4, 2025
Rebalancing was propagating an idle status from a runqueue to the
caller, where it was treated as an error when it is in fact innocuous.

Adds a unit test that reproduces the error without the fix and passes
with the fix.

Fixes #7318
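
To make the failure mode and the fix concrete, here is a minimal sketch of the idea described above. It is not the actual scheduler code: the types, status values, and function names (`sketch_status_t`, `pick_runnable`, `rebalance_propagating`, `rebalance_tolerant`) are hypothetical stand-ins, and the real rebalance logic does more than scan each runqueue.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for illustration; these are not the real
// drmemtrace scheduler types or status values.
enum class sketch_status_t { OK, IDLE, ERROR_INTERNAL };

struct sketch_input_t {
    bool blocked;
};

using sketch_runqueue_t = std::vector<sketch_input_t>;

// Selects the first unblocked input in a runqueue.
// Reports IDLE when nothing is runnable, including the all-blocked case.
sketch_status_t
pick_runnable(const sketch_runqueue_t &queue, int &chosen)
{
    chosen = -1;
    for (std::size_t i = 0; i < queue.size(); ++i) {
        if (!queue[i].blocked) {
            chosen = static_cast<int>(i);
            return sketch_status_t::OK;
        }
    }
    return sketch_status_t::IDLE;
}

// Buggy shape: any non-OK status, including IDLE, is handed to the caller,
// which then treats it as a failure.
sketch_status_t
rebalance_propagating(const std::vector<sketch_runqueue_t> &outputs)
{
    assert(!outputs.empty());
    for (const sketch_runqueue_t &queue : outputs) {
        int chosen;
        sketch_status_t status = pick_runnable(queue, chosen);
        if (status != sketch_status_t::OK)
            return status;
    }
    return sketch_status_t::OK;
}

// Fixed shape: IDLE from one runqueue is innocuous, so it is skipped and
// only genuine errors are returned.
sketch_status_t
rebalance_tolerant(const std::vector<sketch_runqueue_t> &outputs)
{
    assert(!outputs.empty());
    for (const sketch_runqueue_t &queue : outputs) {
        int chosen;
        sketch_status_t status = pick_runnable(queue, chosen);
        if (status == sketch_status_t::IDLE)
            continue;
        if (status != sketch_status_t::OK)
            return status;
    }
    return sketch_status_t::OK;
}
```

With one runnable output and one all-blocked output, the propagating version reports an idle status to its caller while the tolerant version reports success, which is the behavior change the unit test in the commit is described as checking.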
derekbruening added a commit that referenced this issue Mar 12, 2025
If an input is bound to every output, we now ignore those bindings,
which eliminates complexities in initial output allocation where
rebalancing is required to even things out (and that gets complicated
by initially-blocked inputs) as well as removes overhead in the
runqueue code during dynamic scheduling. Binding to every output is
actually not uncommon as users have such bindings as a default value.

As part of detecting bind-to-all, invalid bindings are now detected.

Adds unit tests for the invalid bindings.

Updates the unbalanced rebalance test to bind to all but one as
binding to all is no longer unbalanced.

Issue: #7318
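
The bind-to-all handling described in this commit message can be sketched roughly as follows. This is an illustration under assumptions, not the scheduler's implementation: bindings are modeled as a set of output ordinals, and `binding_result_t` and `normalize_bindings` are hypothetical names.

```cpp
#include <cassert>
#include <set>

// Hypothetical result type for illustration only.
enum class binding_result_t { KEPT, IGNORED_BIND_TO_ALL, INVALID };

// Validates an input's output bindings and drops them when they cover
// every output, since binding to all outputs restricts nothing.
// Assumes outputs are identified by ordinals in [0, output_count).
binding_result_t
normalize_bindings(std::set<int> &bindings, int output_count)
{
    assert(output_count > 0);
    // Any ordinal outside the valid range makes the binding set invalid.
    for (int ordinal : bindings) {
        if (ordinal < 0 || ordinal >= output_count)
            return binding_result_t::INVALID;
    }
    // All entries are unique and in range, so a full-size set means
    // the input is bound to every output.
    if (static_cast<int>(bindings.size()) == output_count) {
        bindings.clear();
        return binding_result_t::IGNORED_BIND_TO_ALL;
    }
    return binding_result_t::KEPT;
}
```

In this sketch the range check runs first, which mirrors the commit's note that invalid bindings are detected as part of detecting bind-to-all. An input bound to all but one output keeps its bindings, as in the updated unbalanced rebalance test.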
derekbruening added a commit that referenced this issue Mar 13, 2025
If an input is bound to every output, we now ignore those bindings,
which eliminates complexities in initial output allocation where
rebalancing is required to even things out (and that gets complicated by
initially-blocked inputs) as well as removes overhead in the runqueue
code during dynamic scheduling. Binding to every output is actually not
uncommon as users have such bindings as a default value.

As part of detecting bind-to-all, invalid bindings are now detected.

Adds unit tests for the invalid bindings.

Updates the unbalanced rebalance test to bind to all but one as binding
to all is no longer unbalanced.

Issue: #7318