-
Notifications
You must be signed in to change notification settings - Fork 62
Fix regression issue in flex decoding. #3999
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR fixes a regression issue in flex decoding by updating block pointer handling and prefetch operations. Key changes include:
- Updating the 2D prefetch shape calculation with a tensor shape bound.
- Replacing an unreachable branch in prefetch conversion with a regular pointer prefetch handler.
- Adjusting mask handling both in prefetch and load/store conversions.
Comments suppressed due to low confidence (1)
third_party/intel/lib/TritonIntelGPUToLLVM/LoadStoreOpToLLVM.cpp:565
- [nitpick] There are multiple debug print statements (e.g., 'johnlu debug:') in production-oriented code; consider removing or gating these under a debug flag to avoid cluttering output during normal execution.
llvm::outs() << "johnlu debug: operands " << (opIdx == DpasEncodingAttr::OpIdx::OperandA ? "A" : "B") << "\n";
82da241
to
73c3fbc
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need to add a unit test. Maybe to test_core or test_block_loads in the Intel directory?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add a lit test (unit test) to cover this bug.
5b9df4f
to
d055a4f
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok for now. I think I prefer a python test here so we can compare to some known good output, vs the lit test which can be hard to parameterize / generalize to lots of different masks. If you have ideas for improving test_block_load.py
to test this condition I would be very interested in hearing them!
cbf6610
to
1225835
Compare
Enhanced the LIT test case to cover the boundary check condition. |
I will add the unit test in another PR #3876 |
…nter. There are `maskElems` values if the `llMask` is null for BlockPointer case. Signed-off-by: Lu,Chengjun <[email protected]>
The
tt.store
operation with BlockPointer fallbacks to scatter store if the BLOCK shape or the value layout was not supported by the 2D BLOCK IO.The lowering code would transform the BlockPointer to the pointers and masks.
The scatter store should apply the
and
to themaskElems
if thellMask
doesn't exsits.