wgsl: Convert quantizeToF16 to use hfround #3118

Merged (1 commit) on Oct 31, 2023

zoddicus
Contributor

Instead of passing the input through an F16Array, use the library-provided function hfround, a fast, lookup-table-based rounding function for f16.

In local benchmarks, this gives a ~20% improvement for fma interval calculations, which are particularly sensitive to quantization cost. Overall, the improvement I saw was closer to ~10%.


Requirements for PR author:

  • All missing test coverage is tracked with "TODO" or .unimplemented().
  • New helpers are /** documented */ and new helper files are found in helper_index.txt.
  • Test behaves as expected in a WebGPU implementation. (If not passing, explain above.)

Requirements for reviewer sign-off:

  • Tests are properly located in the test tree.
  • Test descriptions allow a reader to "read only the test plans and evaluate coverage completeness", and accurately reflect the test code.
  • Tests provide complete coverage (including validation control cases). Missing coverage MUST be covered by TODOs.
  • Helpers and types promote readability and maintainability.

When landing this PR, be sure to make any necessary issue status updates.

@zoddicus
Contributor Author

There is probably a similar optimization to be had from using Math.fround in quantizeToF32.
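
A rough sketch of what that could look like, assuming quantizeToF32 currently round-trips the value through a Float32Array (I have not checked the real body here). The function names `quantizeToF32ViaArray` and `quantizeToF32ViaFround` are hypothetical and only for illustration; `Math.fround` and `Float32Array` are standard JavaScript.

```typescript
// Hypothetical sketch, not the actual CTS implementation.

// Baseline: quantize by storing into a Float32Array and reading back.
const scratch = new Float32Array(1);
function quantizeToF32ViaArray(num: number): number {
  scratch[0] = num;
  return scratch[0];
}

// Alternative: Math.fround rounds a number to the nearest f32 value directly,
// without going through a typed array.
function quantizeToF32ViaFround(num: number): number {
  return Math.fround(num);
}

// Both paths should agree, including for NaN, infinities, and signed zero.
// Object.is is used so that NaN compares equal to NaN and -0 differs from +0.
const samples = [0, -0, 0.1, 1 / 3, 16777217, 1e39, -1e39, 1e-46, NaN];
const allAgree = samples.every(x =>
  Object.is(quantizeToF32ViaArray(x), quantizeToF32ViaFround(x))
);
console.log(allAgree);
```

Whether this is actually faster would need the same kind of local benchmarking as the f16 change; the sketch only shows that the two approaches are interchangeable on these inputs.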

@zoddicus zoddicus merged commit fc58db8 into gpuweb:main Oct 31, 2023
@zoddicus zoddicus deleted the usehfround branch October 31, 2023 20:42
Labels
enhancement (New feature or request), wgsl