[CodeGen] Limit mem ops checks count for reasonable compilation speed #147151


Open
wants to merge 1 commit into
base: main


ivafanas
Contributor

@ivafanas ivafanas commented Jul 5, 2025

We observed a ~5-hour compilation of the HistogramGIFFTMap.cpp file from the Firefox project with a custom out-of-tree backend. The time-consuming part is the analysis of thousands of memory instructions with thousands of memory operands each.

The proposed fix is to limit the number of memory operand checks in places where it is possible to fall back to a conservative answer. With the fix applied, compilation takes ~0.3 sec.
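
The idea of capping the number of checks can be illustrated with a minimal, self-contained sketch. It is not the patch itself: `MemOperand`, `operandsMayOverlap`, `mayAlias`, and the default limit of 16 are hypothetical names and values chosen for illustration, and the toy byte-range overlap test stands in for a real alias query. The sketch only shows the shape of the fallback: when the pairwise work would exceed a budget, the function returns the conservative answer ("may alias") without examining the operands.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a memory operand: the byte range
// [offset, offset + size).
struct MemOperand {
  std::int64_t offset;
  std::int64_t size;
};

// Toy overlap test on byte ranges; a real backend would query alias
// analysis here instead.
static bool operandsMayOverlap(const MemOperand &a, const MemOperand &b) {
  assert(a.size >= 0 && b.size >= 0 && "sizes must be non-negative");
  return a.offset < b.offset + b.size && b.offset < a.offset + a.size;
}

// Returns true if any operand of lhs may overlap any operand of rhs.
// If the number of operand pairs exceeds maxPairChecks, no pair is
// examined and the function answers "true", which is always a safe
// answer for a may-alias query.
bool mayAlias(const std::vector<MemOperand> &lhs,
              const std::vector<MemOperand> &rhs,
              std::size_t maxPairChecks = 16) {
  // Assumption of this sketch: no operand information means "unknown",
  // so answer conservatively.
  if (lhs.empty() || rhs.empty())
    return true;

  // Equivalent to lhs.size() * rhs.size() > maxPairChecks, written with
  // a division so the product cannot overflow.
  if (lhs.size() > maxPairChecks / rhs.size())
    return true;

  for (const MemOperand &a : lhs)
    for (const MemOperand &b : rhs)
      if (operandsMayOverlap(a, b))
        return true;
  return false;
}
```

For small operand lists the result is the same as the exhaustive check. For two stores with ~1000 operands each, the call returns immediately instead of performing ~1,000,000 comparisons, at the cost of possibly reporting an alias that does not exist.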

Details:

It happens in a huge switch statement with ~1000 cases. The root cause is an interaction between the BranchFolder optimization, which is called inside the post-RA IfConverter pass, and the MachineBlockPlacement pass:

  1. BranchFolder moves the identical store instructions out of the predecessors (~1000 of them) into their common block.
  2. The memory operands of the merged store instructions are united. As a result, the MIR contains one block with store instructions, each carrying ~1000 memory operands.
  3. The MachineBlockPlacement pass decides to tail-duplicate these instructions back into the predecessors. As a result, the MIR contains ~1000 blocks with store instructions, each carrying ~1000 memory operands.
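
The growth described in these three steps can be reproduced with a small toy model. This is a sketch under stated assumptions, not LLVM code: `Store`, `Block`, `mergeCommonStore`, `duplicateIntoPredecessors`, and `totalMemOperands` are hypothetical names, memory operands are modeled as plain integers, and the merge simply concatenates operand lists (the real passes may handle operands differently).

```cpp
#include <cstddef>
#include <vector>

// Toy model: a store carries a list of memory operands (plain ints here),
// and a block carries a list of stores.
struct Store {
  std::vector<int> memOperands;
};
struct Block {
  std::vector<Store> stores;
};

// Steps 1 and 2: remove the trailing store from every predecessor and
// build one common block whose single store carries all their operands.
Block mergeCommonStore(std::vector<Block> &preds) {
  Store merged;
  for (Block &pred : preds) {
    if (pred.stores.empty())
      continue;
    const Store &s = pred.stores.back();
    merged.memOperands.insert(merged.memOperands.end(),
                              s.memOperands.begin(), s.memOperands.end());
    pred.stores.pop_back();
  }
  Block common;
  common.stores.push_back(merged);
  return common;
}

// Step 3: copy the stores of the common block back into every predecessor.
void duplicateIntoPredecessors(const Block &common,
                               std::vector<Block> &preds) {
  for (Block &pred : preds)
    pred.stores.insert(pred.stores.end(), common.stores.begin(),
                       common.stores.end());
}

// Counts the memory operands over all stores of all blocks.
std::size_t totalMemOperands(const std::vector<Block> &blocks) {
  std::size_t total = 0;
  for (const Block &b : blocks)
    for (const Store &s : b.stores)
      total += s.memOperands.size();
  return total;
}
```

Starting from 1000 predecessors with one single-operand store each, the model holds 1000 operands in total before step 1 and 1000 × 1000 after step 3. Any check that compares the operands of two such stores pairwise then performs on the order of 10^6 comparisons per instruction pair.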

After that, analysis of memory instructions becomes really time-consuming.

In MIR it looks like the following.

MIR before IfConverter:

bb.2.sw.bb:
; predecessors: %bb.1
  successors: %bb.1019(0x80000000); %bb.1019(100.00%)
  liveins: $dr0
  $r1 = ADDs 0, 1
  $r3 = ADDs 0, 0
  STB $r1, $dr0, 4 :: (store (s8) into %ir.mIsSome.i.i.i, align 4, !alias.scope !4)
  STW $r3, $dr0, 0 :: (store (s32) into %ir.agg.result, !alias.scope !4)
  IBRANCH %bb.1019

bb.3.sw.bb1:
; predecessors: %bb.1
  successors: %bb.1019(0x80000000); %bb.1019(100.00%)
  liveins: $dr0
  $r1 = ADDs 0, 1
  STB $r1, $dr0, 4 :: (store (s8) into %ir.mIsSome.i.i.i2034, align 4, !alias.scope !7)
  STW $r1, $dr0, 0 :: (store (s32) into %ir.agg.result, !alias.scope !7)
  IBRANCH %bb.1019

bb.4.sw.bb3:
; predecessors: %bb.1
  successors: %bb.1019(0x80000000); %bb.1019(100.00%)
  liveins: $dr0
  $r1 = ADDs 0, 1
  $r3 = ADDs 0, 13
  STB $r1, $dr0, 4 :: (store (s8) into %ir.mIsSome.i.i.i2035, align 4, !alias.scope !10)
  STW $r3, $dr0, 0 :: (store (s32) into %ir.agg.result, !alias.scope !10)
  IBRANCH %bb.1019

...

bb.1019.return:
; predecessors: %bb.1008, %bb.1007, %bb.1006, ... ; TOO MANY PREDECESSORS

  RETURN

After BranchFolder is run from IfConverter:

bb.1.sw.bb:
; predecessors: %bb.0
  successors: %bb.109(0x80000000); %bb.109(100.00%)
  liveins: $dr0
  $r1 = ADDs 0, 1
  $r3 = ADDs 0, 0
  IBRANCH %bb.109

bb.2.sw.bb1:
; predecessors: %bb.0
  successors: %bb.1017(0x80000000); %bb.1017(100.00%)
  liveins: $dr0
  $r1 = ADDs 0, 1
  STB $r1, $dr0, 4 :: (store (s8) into %ir.mIsSome.i.i.i2034, align 4, !alias.scope !7)
  STW $r1, $dr0, 0 :: (store (s32) into %ir.agg.result, !alias.scope !7)
  IBRANCH %bb.1017

bb.3.sw.bb3:
; predecessors: %bb.0
  successors: %bb.109(0x80000000); %bb.109(100.00%)
  liveins: $dr0
  $r1 = ADDs 0, 1
  $r3 = ADDs 0, 13
  IBRANCH %bb.109

...

bb.109.return:
; predecessors: %bb.1008, %bb.1007, %bb.1006, ... ; TOO MANY PREDECESSORS
  successors: %bb.1017(0x80000000); %bb.1017(100.00%)
  liveins: $r3, $dr0, $r1
   STB $r1, $dr0, 4 :: (store (s8) into %ir.mIsSome.i.i.i2140, align 4, !alias.scope !325), (store (s8) into %ir.mIsSome.i.i.i2289, align 4, !alias.scope !772), (store (s8) into %ir.mIsSome.i.i.i2288, align 4, !alias.scope !769), (store (s8) into %ir.mIsSome.i.i.i2287, align 4, !alias.scope !766), (store (s8) into %ir.mIsSome.i.i.i2286, align 4, !alias.scope !763), (store (s8) into %ir.mIsSome.i.i.i2285, align 4,  ... ; TOO MANY MEM OPERANDS
  STW $r3, $dr0, 0 :: (store (s32) into %ir.agg.result, !alias.scope !325), (store (s32) into %ir.agg.result, !alias.scope !772), (store (s32) into %ir.agg.result, !alias.scope !769), (store (s32) into %ir.agg.result, !alias.scope !766), (store (s32) into %ir.agg.result, !alias.scope !763), (store (s32) into %ir.agg.result, !alias.scope !760), (store (s32) into %ir.agg.result, !alias.scope !757), (store (s32) into %ir.agg.result, !alias.scope !754),   ... ; TOO MANY MEM OPERANDS
  RETURN

And after MachineBlockPlacement pass:


bb.17.sw.bb31:
; predecessors: %bb.0
  liveins: $dr0
  $r1 = ADDs 0, 1
  $r3 = ADDs 0, 360
  STB $r1, $dr0, 4 :: (store (s8) into %ir.mIsSome.i.i.i2140, align 4, !alias.scope !325), (store (s8) into %ir.mIsSome.i.i.i2289, align 4, !alias.scope !772), ... ; TOO MANY MEM OPERANDS
  STW $r3, $dr0, 0 :: (store (s32) into %ir.agg.result, !alias.scope !325), (store (s32) into %ir.agg.result, !alias.scope !772), ... ; TOO MANY MEM OPERANDS
  RETURN

bb.84.sw.bb165:
; predecessors: %bb.0
  liveins: $dr0
  $r1 = ADDs 0, 1
  $r3 = ADDs 0, 578
  STB $r1, $dr0, 4 :: (store (s8) into %ir.mIsSome.i.i.i2140, align 4, !alias.scope !325), (store (s8) into %ir.mIsSome.i.i.i2289, align 4, !alias.scope !772), ... ; TOO MANY MEM OPERANDS
  STW $r3, $dr0, 0 :: (store (s32) into %ir.agg.result, !alias.scope !325), (store (s32) into %ir.agg.result, !alias.scope !772) ... ; TOO MANY MEM OPERANDS
  RETURN

bb.85.sw.bb167:
; predecessors: %bb.0
  liveins: $dr0
  $r1 = ADDs 0, 1
  $r3 = ADDs 0, 581
  STB $r1, $dr0, 4 :: (store (s8) into %ir.mIsSome.i.i.i2140, align 4, !alias.scope !325), (store (s8) into %ir.mIsSome.i.i.i2289, align 4, !alias.scope !772), ... ; TOO MANY MEM OPERANDS
  STW $r3, $dr0, 0 :: (store (s32) into %ir.agg.result, !alias.scope !325), (store (s32) into %ir.agg.result, !alias.scope !772), ... ; TOO MANY MEM OPERANDS
  RETURN

...

The issue seems to be related only to backends that use the post-RA IfConverter pass. It affects PowerPC, Hexagon, SystemZ and AMDGPU.

We would like to share the fix with the community, if that is OK.

@ivafanas
Contributor Author

ivafanas commented Jul 5, 2025

Hi, @arsenm

Could you please review the PR?

@ivafanas ivafanas force-pushed the dev/limit-linear-num-mem-operands-checks branch from 309efd2 to e777d37 Compare July 5, 2025 23:57