pointwise scheduler fails to validate reference tv #3513
!test
This reverts commit 7333806.
!test
🤞
!test
!test --diff-bench
Ah, this actually can cause an issue. For example, suppose we pick a tensor as a reference that has a broadcast ID, and that broadcast ID comes from a fusion input tensor. Suppose that broadcast ID is also used by a pad op, generating a non-broadcast ID, and that non-broadcast ID is NOT included in the reference. More specifically:
Here, suppose we choose
I was worried about the same thing and I was thinking about changing the But turns out we don't need that. You can look at the other example I added for pad (there's a typo in the comment on tv0, I'll fix that). It's very similar to yours. I think the difference between this example and the original issue we had is due to
!test --diff-bench
!test --diff-bench
errr... what's with CI 😭
!test --diff-bench
My gut feeling, based on my naive understanding of transform propagation, if we actually have an The other scenario is also pretty interesting to me. #3576 NOTE for myself. I think I should go have a look at how transform propagation actually works to verify this for peace of mind.
LGTM. Thanks for the fix.
!test --diff-bench
!test --diff-bench
Fixes: #3512

When picking the reference tv, the pointwise scheduler fails to validate that the transformations on the reference tv can be safely propagated to all outputs in the fusion. The issue occurs when an IterDomain that is not in the reference tv is merged with another dimension in the output tv, which prevents the merge on the reference tv from being propagated to the target.

This PR adds an optional check, `areAllOutputIdsMappedTo`, in `nvfuser::pointwise_utils::DomainMap::isValidReference`. The added check verifies that every source producer IterDomain that produces an IterDomain on the outputs is covered by the reference tv. This is safe for the pointwise scheduler, since the scheduler checks that there's no reversible view present in the fusion.

The check is optional and is disabled by the transpose scheduler, where the reference_tv is not supposed to cover the entire fusion, but rather a subset of fusion IO tensors. We should extend that in future PRs.

Co-authored-by: Naoya Maruyama
Co-authored-by: Jacob Hinkle
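
To illustrate the idea behind the coverage check, here is a minimal, self-contained sketch. It is not the nvFuser implementation and does not use nvFuser types: `Id`, `ProducerMap`, `collectSources`, and `allOutputSourcesCovered` are hypothetical names, IterDomains are modeled as plain integers, and "covered by the reference tv" is assumed to mean membership in a set of source ids.

```cpp
#include <unordered_map>
#include <unordered_set>
#include <vector>

// Hypothetical stand-in for an IterDomain: just an integer id.
using Id = int;

// Toy producer graph: maps a derived id to the ids it was produced from.
// An id with no entry is treated as a source id.
using ProducerMap = std::unordered_map<Id, std::vector<Id>>;

// Walk producers backwards from `id` and record every source id reached.
void collectSources(
    Id id,
    const ProducerMap& producers,
    std::unordered_set<Id>& visited,
    std::unordered_set<Id>& sources) {
  if (!visited.insert(id).second) {
    return; // already handled
  }
  auto it = producers.find(id);
  if (it == producers.end() || it->second.empty()) {
    sources.insert(id); // no producers: this is a source id
    return;
  }
  for (Id producer : it->second) {
    collectSources(producer, producers, visited, sources);
  }
}

// Returns true only if every source id feeding any output id is among the
// source ids covered by the candidate reference.
bool allOutputSourcesCovered(
    const std::vector<Id>& output_ids,
    const ProducerMap& producers,
    const std::unordered_set<Id>& reference_sources) {
  std::unordered_set<Id> visited;
  std::unordered_set<Id> sources;
  for (Id out : output_ids) {
    collectSources(out, producers, visited, sources);
  }
  for (Id source : sources) {
    if (reference_sources.count(source) == 0) {
      return false; // an output depends on an id the reference does not cover
    }
  }
  return true;
}
```

In this toy model, an output id produced by merging source ids 1 and 2 is rejected for a candidate reference that covers only ids 0 and 1, which mirrors the failure described above: the merge involves a dimension the reference does not have.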