wgsl: Convert quantizeToF32 to use Math.fround (#3119)
Instead of round-tripping the input through a Float32Array, use the
built-in Math.fround.

This yields a ~5% improvement when benchmarking locally. The gain is
smaller than for the equivalent f16 change, because Float32Array is
provided by the runtime, whereas its f16 counterpart is polyfilled, so
the f32 path was probably more efficient to begin with.
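
As a rough illustration of why the swap should be behavior-preserving, the sketch below compares the two approaches on a handful of edge cases. The helper names `quantizeViaTypedArray` and `quantizeViaFround` and the sample list are hypothetical and not part of the CTS; the check is a spot test, not a proof of equivalence.

```typescript
// Old approach: store into a one-element Float32Array and read the value back.
const scratch = new Float32Array(1);
function quantizeViaTypedArray(num: number): number {
  scratch[0] = num;
  return scratch[0];
}

// New approach: let the built-in do the f64 -> f32 rounding directly.
function quantizeViaFround(num: number): number {
  return Math.fround(num);
}

// Hypothetical sample inputs: signed zeros, inexact values, overflow,
// f32 subnormals, underflow, infinities, and NaN.
const samples: number[] = [
  0, -0, 1, 0.1, -1.337, 5.05,
  2 ** 150, -(2 ** 150),
  2 ** -149, 2 ** -160,
  Number.MAX_VALUE, Number.MIN_VALUE,
  Infinity, -Infinity, NaN,
];

// Object.is distinguishes +0 from -0 and treats NaN as equal to itself,
// so it is a stricter comparison than === for this purpose.
const allMatch: boolean = samples.every((x) =>
  Object.is(quantizeViaTypedArray(x), quantizeViaFround(x))
);
console.log(allMatch);
```

Both paths convert the double to the nearest f32 value, so the only expected difference is the cost of the typed-array store and load.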
zoddicus authored Oct 31, 2023
1 parent fc58db8 commit cb5b33c
Showing 1 changed file with 1 addition and 5 deletions.
6 changes: 1 addition & 5 deletions src/webgpu/util/math.ts
@@ -2013,13 +2013,9 @@ export interface QuantizeFunc {
   (num: number): number;
 }

-/** Statically allocate working data, so it doesn't need per-call creation */
-const quantizeToF32Data = new Float32Array(new ArrayBuffer(4));
-
 /** @returns the closest 32-bit floating point value to the input */
 export function quantizeToF32(num: number): number {
-  quantizeToF32Data[0] = num;
-  return quantizeToF32Data[0];
+  return Math.fround(num);
 }

 /** @returns the closest 16-bit floating point value to the input */
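
To make "the closest 32-bit floating point value" concrete, here is a small standalone sketch of how the post-change function behaves on a few inputs. The function body is copied from the diff above; the variable names and chosen inputs are illustrative assumptions only.

```typescript
// Standalone copy of the post-change function from the diff.
function quantizeToF32(num: number): number {
  return Math.fround(num);
}

// 5.5 is exactly representable in f32, so it passes through unchanged.
const exact = quantizeToF32(5.5); // 5.5

// 5.05 is not representable in f32, so it rounds to the nearest f32 value.
const rounded = quantizeToF32(5.05); // 5.050000190734863

// Magnitudes beyond the f32 range overflow to infinity.
const overflow = quantizeToF32(2 ** 150); // Infinity

// Quantizing an already-quantized value is a no-op.
const idempotent = quantizeToF32(rounded) === rounded; // true

console.log(exact, rounded, overflow, idempotent);
```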