-
-
Notifications
You must be signed in to change notification settings - Fork 5.7k
lowering: avoid Core.Box for captured variables assigned in all if/elseif/else branches
#60542
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
2e3584f to
bd3ffdc
Compare
f1b1acb to
898eeec
Compare
…else branches
Enhance lambda-optimize-vars! to recognize when a captured variable is
assigned in all branches of an if-else or if-elseif-else statement. Such
variables are effectively single-assigned on each control flow path and
don't need boxing even though they appear to be assigned multiple times
syntactically.
This avoids unnecessary Core.Box allocations for common patterns like:
if cond1
x = a
elseif cond2
x = b
else
x = c
end
return () -> x
Co-Authored-By: Claude <[email protected]>
When a variable is used (read) before being assigned within a branch, it cannot safely have the never-undef optimization applied. Update mark-used to also check and remove ifa candidates in this case. Co-Authored-By: Claude <[email protected]>
Address several edge cases in the if-else branch optimization for captured variables: - Add `ifa-assigned` table to track assignments independently of `kill` (which clears `live` on labels/gotos) - Check `has-symbolic-goto` after visiting branches to disable optimization when any @goto is present (could skip assignments) - Add `ifa-used-first` table to detect use-before-assign in any branch - Add `ifa-promoted` table to prevent `mark-captured` from incorrectly removing promoted variables from `unused` Co-Authored-By: Claude <[email protected]>
898eeec to
6078029
Compare
|
Not that I'd expect much, but it doesn't seem to have much of a performance improvement wrt. CI timing. Comparison of this build against recent timing |
|
I'm a big fan of this improvement, but the test coverage is a little thin TBH Ideally, a change like this would have tests for correctness (not optimizing when it's unsafe) in addition to the tests you added for precision (optimizing when it is safe). I think this would benefit from more test coverage or (very) thorough review to make sure the approach here is sound. |
|
I'm not sure this makes sense. Could you maybe explain how the added unboxing logic works? My concern is that throughout Easy counterexample: |
Distinguish between nested if-else (where inner promotion should be kept) and sequential if-else blocks (where a second if-else assigning the same variable should undo the promotion). Also handle loops by reverting any promotions that occurred inside the loop body. Add comprehensive tests for nested if-else, loops, and the sequential if-else counterexample. Co-authored-by: Claude <[email protected]>
|
Thanks for the reviews! The tests are now expanded, and claude seems to have figured out getting them to pass. Perhaps the main value here is developing out and agreeing on the tests. |
|
I worry that if I provide another counterexample, Claude will produce an even more convoluted "analysis" that hacks around the test |
|
Yeah that fails. I'll add it as a failing test and leave the PR where it is |
|
here is another in particular, I would think this optimization is only possible if the variable is never captured again within one of the branches before assignment |
…-region Rewrite the if-else optimization to use a cleaner approach: only optimize when the if-else block is the variable's single definition region. The optimization applies when: 1. No assignments to var outside the if-else (before or after) 2. No use of var before assignment in any branch 3. Exactly one assignment per branch (all branches must assign) 4. Capture occurs after the if-else completes 5. No loops inside branches (which would create multiple assignments) This treats the if-else as a single φ-node (single definition point). Implementation uses these tracking tables: - ifa-candidates: captured multi-assigned vars (potential optimization targets) - assigned-outside: vars assigned outside any qualifying if-else - captured-early: vars captured before their if-else completes - ifa-completed: vars that completed a valid if-else (all branches assigned) - in-ifa-for: during if-else visit, which vars we're tracking - used-before-assign: vars used before assigned in some branch During visit: - When entering if-else, mark eligible candidates in in-ifa-for - Track assignments per branch; intersect to find vars assigned in ALL branches - Track use-before-assign within branches - Clear in-ifa-for when entering loops (assignments in loops are "outside") - Mark vars as ifa-completed if assigned in all branches, not captured early, not used-before-assign At end, promote ifa-completed vars that aren't assigned-outside or captured-early. Co-authored-by: Claude <[email protected]>
|
Thanks. I added that test too. I also got claude to review the tests and reviews here, and come up with a better strategy. Here's the summary doc it made, and strategy D is now implemented, and passes tests locally: Box Avoidance Optimization StrategiesProblem StatementWhen a variable is captured by a closure and assigned multiple times, Julia allocates a Test CasesOptimized cases (should NOT box)
Boxed cases - incomplete branch coverage
Boxed cases - use/capture timing
Boxed cases - control flow
Boxed cases - loops
Boxed cases - multiple assignments
StrategiesStrategy A: Original Attempt (Complex State Tracking)Track multiple tables ( Pros: Attempts to handle many cases
Strategy B: Only-Assignment-Site (Strict)Rule: Only optimize if the if-else block contains the ONLY assignments to the variable in the entire function. Check: Implementation:
Pros: Simple to understand, conservative Strategy C: Capture-After-Only-If-Else (Very Strict)Rule: Only optimize if:
Implementation:
Pros: Very conservative, likely correct Strategy D: Single-Definition-Region (Implemented)Rule: The if-else block must be the "single definition region" for the variable:
This treats the if-else as a single φ-node (single definition point). Implementation:
During visit:
At end:
Final ImplementationStrategy D was implemented. Key aspects:
The key insight: we're not trying to do general dataflow analysis. We're identifying a specific pattern where an if-else block IS the variable's only definition, making it effectively single-assigned. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not really convinced by this iteration-via-Claude-via-third-party approach to solving this problem.
This algorithm is still quite limited, as demonstrated by the fact that it handles:
if cond
x = 1
else
x = 2
end
return ()->xbut not:
x = 2
if cond
x = 1
end
return ()->xIn fact many of the added tests are examples of cases that would be supported by a more complete control-flow analysis, not of required safety or good behavior. Having several serious bugs found so far does not increase my confidence.
Here is (yet) another one:
function ifa_basic(cond, a, b)
local x
while true
if cond
x = a
else
break
x = b
end
break
end
return ()->x
endWith your change ifa_basic(false, 1, 2) throws UndefVarError because lowering believes it to be always-assigned and eligible to be unboxed.
I think we probably need to go a different direction here. This approach introduces a substantial amount of ad-hoc control flow analysis to closure conversion, and I don't think the extra complexity justifies such a specific optimization, especially given the bugs so far.
| (set! branch-assigned (table)) | ||
| (let ((elseif-branch-tbl (table))) | ||
| (let visit-elseif-then ((e2 (caddr expr))) | ||
| (cond |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Much of this code is heavily duplicated above - it could use re-factoring
| end | ||
| return () -> x | ||
| end | ||
| @test fieldtype(typeof(box_ifelse_inside_loop(3, 1, 2)), 1) === Core.Box |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another imprecise case
| end | ||
| return () -> x | ||
| end | ||
| @test fieldtype(typeof(box_while_loop_in_branch(true, 1, 2)), 1) === Core.Box |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Again this is testing that the analysis is imprecise (does not optimize when it would be legal to)
| x = 0 | ||
| return () -> x | ||
| end | ||
| @test fieldtype(typeof(box_goto_skips_assign(true, 1, 2)), 1) === Core.Box |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another imprecise case
| else | ||
| x = b | ||
| end | ||
| return () -> x |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another imprecise case
| end | ||
| return () -> x | ||
| end | ||
| @test fieldtype(typeof(box_missing_else(true, 1)), 1) === Core.Box |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Many (most) of these are situations where the optimization is actually valid, but the algorithm has decided not to optimize anyway.
Useful tests need to cover cases where the optimization is actually unsafe to apply.
These are just asserting that the algorithm is imprecise, which is not useful.
|
while we're throwing AI at it, here's three more failure modes gemini helped me get: |
|
Ok. I think I'm not being effective in trying to help this along. Sorry for the noise. |
An attempt at doing the
Core.Boxavoidance mentioned in #60479 (comment) automatically @KristofferCDeveloped with Claude:
Variables that are captured by a closure and assigned in all branches of an
if/elseif/else statement are now treated as effectively single-assigned,
avoiding unnecessary Core.Box heap allocations.
The optimization is disabled when:
@gotois present (could skip assignments)