
Do FP8 rowwise bias addition in higher precision #4095


Open

jwfromm wants to merge 1 commit into main from export-D74408348

Conversation

jwfromm (Contributor) commented May 8, 2025

Summary: Previously, when a bias was used in our FP8 rowwise kernel, it was added to the accumulator in its native precision. For example, if the bias is bf16, we would do a bf16 + bf16 addition. However, it is both slightly more efficient and slightly more accurate to keep the accumulator in fp32, cast the bias to fp32, and do the addition in fp32.
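
For illustration, a minimal sketch of the change (a hypothetical standalone helper, not the actual FBGEMM/CUTLASS epilogue code; the name `add_bias_fp32` and its signature are invented for this example):

```cpp
#include <cuda_bf16.h>

// Hypothetical device helper sketching the epilogue change. Before this PR,
// the fp32 accumulator was rounded to the bias dtype and the add was done
// there; now the bias is upcast and the add stays in fp32.
__device__ float add_bias_fp32(float accum, __nv_bfloat16 bias) {
  // Before: two roundings, one on the downcast and one on the bf16 add:
  //   __nv_bfloat16 out = __hadd(__float2bfloat16(accum), bias);
  // After: upcast the bias and add in fp32; rounding happens only once,
  // when the final result is converted to the output dtype.
  return accum + __bfloat162float(bias);
}
```

Since bf16 carries only about 8 bits of significand precision, a bf16 + bf16 add can round away much of the bias contribution when the accumulator is large; keeping the add in fp32 defers rounding to a single final conversion.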

Differential Revision: D74408348

netlify bot commented May 8, 2025

Deploy Preview for pytorch-fbgemm-docs ready!

| Name | Link |
| --- | --- |
| 🔨 Latest commit | 52487ea |
| 🔍 Latest deploy log | https://app.netlify.com/sites/pytorch-fbgemm-docs/deploys/681cf6ebb6d49f0008dff178 |
| 😎 Deploy Preview | https://deploy-preview-4095--pytorch-fbgemm-docs.netlify.app |

facebook-github-bot (Contributor) commented:

This pull request was exported from Phabricator. Differential Revision: D74408348

Summary:
X-link: facebookresearch/FBGEMM#1179


Previously, when a bias was used in our FP8 rowwise kernel, it was added to the accumulator in its native precision. For example, if the bias is bf16, we would do a bf16 + bf16 addition. However, it is both slightly more efficient and slightly more accurate to keep the accumulator in fp32, cast the bias to fp32, and do the addition in fp32.

Reviewed By: jianyuh

Differential Revision: D74408348
jwfromm force-pushed the export-D74408348 branch from 1a3516a to 52487ea on May 8, 2025, 18:24
facebook-github-bot (Contributor) commented:

This pull request was exported from Phabricator. Differential Revision: D74408348
