I'm puzzled as to why the act_range of q_proj is included when calculating the scale for int8_kv_cache. That scale is only used to quantize the outputs of k_proj and v_proj, since only keys and values are stored in the KV cache.
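To make the question concrete, here is a minimal sketch of the difference I mean. It assumes a symmetric per-tensor int8 scale of the form max(|output|) / 127. The `act_range` values, the `kv_cache_scale` helper, and the `quantize` helper are all hypothetical and made up for illustration; they are not taken from the repository's code.

```python
# Hypothetical calibration results: max absolute output value per projection.
# These numbers are invented for illustration only.
act_range = {
    "q_proj": 12.0,
    "k_proj": 6.35,
    "v_proj": 3.0,
}

INT8_MAX = 127.0


def kv_cache_scale(act_range, projections):
    """Symmetric per-tensor int8 scale from the largest range among `projections`."""
    return max(act_range[name] for name in projections) / INT8_MAX


def quantize(x, scale):
    """Round to the nearest int8 step and clamp to [-127, 127]."""
    q = round(x / scale)
    return max(-127, min(127, q))


# Scale computed with q_proj included (the behavior I am asking about).
scale_qkv = kv_cache_scale(act_range, ["q_proj", "k_proj", "v_proj"])

# Scale computed from k_proj and v_proj only (what I would expect).
scale_kv = kv_cache_scale(act_range, ["k_proj", "v_proj"])

# The largest key value uses the full int8 range only with the k/v-only scale.
k_max = act_range["k_proj"]
print(quantize(k_max, scale_qkv), quantize(k_max, scale_kv))
```

In this made-up case the range of q_proj is larger than that of k_proj and v_proj, so including it inflates the scale and the cached keys and values use only part of the int8 range. If the range of q_proj is never the largest, the two scales are identical and the question is moot.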