Does anyone understand why the number of sync operations in the FA3 store function (flash-attention/hopper/epilogue_bwd_sm90_tma.hpp, line 161 in 0dfb281) is 256+32 (within a single warp, with all threads arriving)? Why does the non-varlen case require arrive before sync, while the varlen case only needs sync without arrive?