Summary:
X-link: facebookresearch/FBGEMM#1149
When max_seq_len is larger than 8192, one input sample is divided into multiple sequences. For example:
With bs = 2 and seqlen = 7, prefill attention gets seq_lens = [0, 7, 7, 7, 7, 14, 14, 14, 14]. In decoding, the sample is not divided, because that case is handled by the gappy bias.
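The example values can be reproduced with a small sketch. It is an illustration only, not the code in this change: the helper name `split_seq_starts` is hypothetical, the values are read as cumulative start offsets, and it assumes each sample gets ceil(max_seq_len / 8192) slots. The four slots per sample in the example would then correspond to max_seq_len = 32768, which is also an assumption.

```python
from itertools import accumulate

CHUNK = 8192  # split threshold named in the summary


def split_seq_starts(sample_lens, max_seq_len, chunk=CHUNK):
    """Hypothetical helper: cumulative offsets after splitting each sample.

    Assumes every sample occupies ceil(max_seq_len / chunk) slots and that
    slots past the end of the sample's tokens have length 0.
    """
    slots_per_sample = -(-max_seq_len // chunk)  # ceiling division
    slot_lens = []
    for n in sample_lens:
        for i in range(slots_per_sample):
            # Tokens left for slot i, clamped to the range [0, chunk].
            slot_lens.append(max(0, min(chunk, n - i * chunk)))
    # Prefix sums with a leading 0 give one offset per slot boundary.
    return [0] + list(accumulate(slot_lens))


# bs = 2, seqlen = 7, assuming max_seq_len = 32768 (4 slots per sample)
print(split_seq_starts([7, 7], max_seq_len=32768))
# [0, 7, 7, 7, 7, 14, 14, 14, 14]
```

Under these assumptions, the repeated entries mark empty slots: each sample contributes one slot of 7 tokens followed by three slots of length 0.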
Differential Revision: D73833204