I appreciate the effort and time you’ve invested in creating and sharing these exercises and reference solutions. Thank you for supporting the community and providing such valuable resources!
I’ve been working through your exercises. While reviewing the reference solution, I encountered a part of the code that I believe might contain an error in calculating the weight offsets.
Issue Details
In the problem, weights are stored in 4 bits, with FPINT (e.g., 8) weights packed into a single 32-bit integer. The reference solution computes the weight offset as follows:
off_weight_l = l + tl.arange(0, B_MID // FPINT)
However, as I understand it, off_weight_l indexes the packed 32-bit integers rather than the individual weights, so the starting index l should also be divided by FPINT. I believe the correct calculation is:
off_weight_l = l // FPINT + tl.arange(0, B_MID // FPINT)
This adjustment maps the starting index to its position in the array of packed 32-bit integers. Without the division, and if l counts unpacked weights as it appears to, the two terms of the offset use different units: l is in weights while the tl.arange term is in packed integers. The load could then fetch the wrong packed integers, and the subsequent computation would use the wrong weights.
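To make the mismatch concrete, here is a minimal plain-Python sketch (not Triton) that lists the packed indices each tile would touch under both formulas. It assumes the surrounding loop steps l from 0 to MID in increments of B_MID, which the snippet does not show, and the sizes are illustrative rather than taken from the exercise.

```python
# Illustrative sizes (assumptions, not taken from the exercise).
FPINT = 8    # 4-bit weights per 32-bit integer
MID = 64     # unpacked length of the reduction dimension
B_MID = 32   # unpacked tile size along that dimension

n_packed = MID // FPINT  # number of 32-bit integers per weight row


def packed_offsets(start):
    """Mimic `start + tl.arange(0, B_MID // FPINT)` with a plain list."""
    return [start + i for i in range(B_MID // FPINT)]


def in_bounds(tiles):
    """Keep only offsets that pass the bound used by mask_weight_l."""
    return [o for tile in tiles for o in tile if o < n_packed]


# Assumed loop: l walks the unpacked dimension in steps of B_MID.
reference = [packed_offsets(l) for l in range(0, MID, B_MID)]
proposed = [packed_offsets(l // FPINT) for l in range(0, MID, B_MID)]

print("reference tiles:", reference)
print("proposed tiles: ", proposed)
print("reference covers:", in_bounds(reference))
print("proposed covers: ", in_bounds(proposed))
```

Under these assumptions, the second tile of the reference formula starts at packed index 32, which lies outside the 8 packed integers in the row, while the proposed formula covers indices 0 through 7 exactly once. If MID <= B_MID, the loop runs only once with l = 0 and both formulas agree, which may be why the difference is easy to miss in small tests.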
Reference Code Snippet
Here is the relevant portion of the reference answer for context:
# load weight
# note: our weight will be stored in 4bits.
off_weight_l = l + tl.arange(0, B_MID // FPINT)
mask_weight_l = off_weight_l < (MID // FPINT)
off_weight = off_j[:, None] * (MID // FPINT) + off_weight_l[None, :]
mask_weight = mask_j[:, None] & mask_weight_l[None, :]
weight = tl.load(weight_ptr + off_weight, mask=mask_weight)
My Concern
If l is not divided by FPINT, off_weight_l may not hold the indices of the packed integers. Since each packed integer contains FPINT weights, the offset would advance FPINT times too fast from one loop iteration to the next. The load could then read the wrong memory locations, or be masked out entirely by mask_weight_l, so incorrect weight values would be used in the computation.
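The effect on the loaded values can be simulated in plain Python. This is only a sketch under several assumptions that are not confirmed by the snippet: the loop steps l by B_MID, the weights are packed lowest nibble first, and a masked-out load is modeled as returning 0. The helper names (pack, unpack, load_row) are hypothetical.

```python
FPINT = 8            # 4-bit weights per 32-bit integer
BITS = 32 // FPINT   # bits per weight (4)
MID = 16             # illustrative unpacked row length
B_MID = 8            # illustrative unpacked tile size

# Ground-truth 4-bit weights for one row (arbitrary non-zero test values).
weights = [(3 * k + 1) % 16 for k in range(MID)]


def pack(values):
    """Pack FPINT 4-bit values per integer, lowest nibble first (assumed order)."""
    words = []
    for base in range(0, len(values), FPINT):
        word = 0
        for i, v in enumerate(values[base:base + FPINT]):
            word |= (v & 0xF) << (i * BITS)
        words.append(word)
    return words


def unpack(word):
    """Extract the FPINT 4-bit values stored in one packed integer."""
    return [(word >> (i * BITS)) & 0xF for i in range(FPINT)]


packed = pack(weights)


def load_row(start_of):
    """Walk the row tile by tile; start_of(l) gives the first packed index."""
    out = []
    for l in range(0, MID, B_MID):  # assumed loop over the unpacked dimension
        for i in range(B_MID // FPINT):
            off = start_of(l) + i
            # Masked load modeled as 0 when off is out of bounds (assumption).
            word = packed[off] if off < MID // FPINT else 0
            out.extend(unpack(word))
    return out


from_reference = load_row(lambda l: l)           # off_weight_l = l + ...
from_proposed = load_row(lambda l: l // FPINT)   # off_weight_l = l // FPINT + ...

print("matches with reference offsets:", from_reference == weights)
print("matches with proposed offsets: ", from_proposed == weights)
```

In this simulation, only the proposed offsets reproduce the original weights; with the reference offsets, the second tile falls outside the packed row and contributes nothing.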
Request for Clarification
Could you please verify whether the offset calculation for off_weight_l in the reference solution is correct?
Thank you for your assistance!