QS8 gemm use V73 int to float and multiply for quantization #8242

Merged
copybara-service[bot] merged 1 commit into master from test_745880585
Apr 13, 2025

copybara-service[bot]
Contributor

  • int to IEEE float is exact
  • hvx mpy float to qfloat
  • increase tolerance to difference of 1 for qfloat
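
For readers unfamiliar with float requantization, the three points above can be illustrated with a scalar sketch. This is not the HVX kernel from this change: `requantize_fp32` and `within_tolerance` are hypothetical names, the rounding mode is an assumption, and plain IEEE `float` arithmetic stands in for the qfloat multiply. It only shows the shape of the computation (convert the int32 accumulator to float, multiply by the scale, round and clamp to int8) and how a test might accept a difference of 1.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar model of float requantization; not XNNPACK's kernel. */
static int8_t requantize_fp32(int32_t acc, float scale, int32_t zero_point) {
  /* int32 -> IEEE float: exact when |acc| <= 2^24, otherwise rounded. */
  float scaled = (float)acc * scale;
  /* Clamp in float so the integer conversion below cannot overflow. */
  const float lo = (float)(INT8_MIN - zero_point);
  const float hi = (float)(INT8_MAX - zero_point);
  if (scaled < lo) scaled = lo;
  if (scaled > hi) scaled = hi;
  /* Round half away from zero (an assumption; a real kernel may differ). */
  int32_t rounded = (int32_t)(scaled + (scaled >= 0.0f ? 0.5f : -0.5f));
  return (int8_t)(rounded + zero_point);
}

/* Hypothetical test helper: accept results that differ by at most tol. */
static int within_tolerance(int8_t actual, int8_t expected, int tol) {
  assert(tol >= 0);
  int diff = (int)actual - (int)expected;
  if (diff < 0) diff = -diff;
  return diff <= tol;
}
```

A reduced-precision multiply can land on the other side of a rounding boundary, which is why a comparison such as `within_tolerance(actual, expected, 1)` is a plausible way to express the relaxed check.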

PiperOrigin-RevId: 747153592
copybara-service[bot] merged commit 989eed0 into master on Apr 13, 2025
copybara-service[bot] deleted the test_745880585 branch on April 13, 2025 21:03