Specify that scalar values are cast to match the input type #678
Comments
Relevant:
Here's one way this could be specified, using linear() as an example.
And then add a definition of cast like:
With caveats:
dictionary MLLinearFloat32Options {
  float alpha = 1;
  float beta = 1;
};

dictionary MLLinearFloat16Options {
  half alpha = 1; // this type doesn't actually exist
  half beta = 1;  // this type doesn't actually exist
};

partial interface MLGraphBuilder {
  MLOperand linearFloat32(MLOperand input, optional MLLinearFloat32Options options = {});
  MLOperand linearFloat16(MLOperand input, optional MLLinearFloat16Options options = {});
};

… which is probably more relevant for integer types, because IDL+JS have well-defined, if sometimes surprising, behavior here. See #489 for discussion about casting within backends, but at least at the IDL/interface level we should behave predictably.
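For a concrete example of that well-defined-but-surprising integer behavior (an illustration, not spec text): WebIDL's default conversion for an unannotated `long` runs ECMAScript's ToInt32, which silently wraps out-of-range values instead of throwing.

```js
// Sketch of WebIDL's unannotated `long` conversion: ToInt32 wraps
// out-of-range JS numbers modulo 2^32 rather than throwing, unless
// [EnforceRange] or [Clamp] is specified on the IDL member.
function idlLong(x) {
  return x | 0; // bitwise-or with 0 performs ToInt32
}
console.assert(idlLong(2 ** 31) === -(2 ** 31)); // wraps into the negative range
console.assert(idlLong(2 ** 32) === 0);          // full wrap-around
```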
More notes:
Also, what conversion options would we use for these? Specifically:
This sounds like #489, but that issue is specifically about the casting operator, which might depend on the underlying platform. This issue is talking about the JS interface, which is going to be implemented by the user agent.
- Introduce a "cast" definition that takes a number and a type, and returns the number cast to that type.
- Invoke cast during MLOperand and MLActivation creation.

TODO:
- Passing restrictions
  - Floating point - allow Infinities/NaNs or not?
  - Integer - throw or clamp if out of range?
- Simplify supported restrictions
- resample2d sizes option - is this part of the op data or not?
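A minimal sketch of what that cast definition could look like, expressed in JS (the name `castToDataType` and the clamp-vs-throw choice are assumptions for illustration only; Float16Array is the ES2025 feature and not yet universally available):

```js
// Hedged sketch of a "cast(number, type)" definition — not spec text.
// The open questions from the TODO above are marked inline.
function castToDataType(value, dataType) {
  switch (dataType) {
    case "float32":
      return Math.fround(value); // round the JS double to the nearest float32
    case "float16": {
      // Round-trip through Float16Array (ES2025) to round to float16.
      // Open question: allow Infinities/NaNs through, or reject them?
      const buf = new Float16Array(1);
      buf[0] = value;
      return buf[0];
    }
    case "int32": {
      // Open question: throw or clamp if out of range? This sketch clamps,
      // mirroring WebIDL's [Clamp] behavior.
      const i = Math.trunc(value);
      return Math.min(Math.max(i, -(2 ** 31)), 2 ** 31 - 1);
    }
    default:
      throw new TypeError(`unsupported dataType: ${dataType}`);
  }
}
```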
And then a further note for consideration: if we're being explicit about casting everywhere, are there any places that are currently […] This is fairly moot, as all of the relevant ops accept only floating point inputs, so whatever the developer supplies will ultimately be cast to a float32 or float16, and so specifying the input as […]
Unsorted notes from internal discussion w/ @a-sully and @philloooo: For batchNorm's epsilon.
And:
So overall, the "cast parameters to the same data type as the input tensors" approach may be okay and likely the least surprising for developers, even if the result gets upcast (e.g. fp16 to fp32) again, but there's a lot of nuance to investigate.
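To make that upcast nuance concrete (illustration only, again using Float16Array):

```js
// A JS double cast to float16 and then upcast again does not recover the
// original value; the developer-visible result is the float16-rounded number.
const f16 = new Float16Array([0.1]);
console.log(f16[0]);              // 0.0999755859375 — 0.1 rounded to float16
console.log(Math.fround(f16[0])); // unchanged: exactly representable in float32
```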
Some of the operations have scalar parameters, e.g. linear's and gemm's `alpha` and `beta`. These are passed as `float` numbers. I think we should specify in the processing algorithm that they get cast to match the input type, so that if the input is float16, they also get downcast to float16.

Background: in some cases, CoreML requires the parameter types to match. So we either downcast the scalar params to float16, or upcast input operands to float32; the latter is less ideal because on CoreML only float16 gets executed on the NPU.
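For illustration, here's roughly what the proposed step would mean at graph-build time (a sketch; the shape and values are made up):

```js
// builder is an MLGraphBuilder; the input operand is float16.
const input = builder.input("x", { dataType: "float16", shape: [2, 2] });

// alpha/beta arrive from JS as doubles (IDL float). Under this proposal the
// processing algorithm casts them to the input's dataType (float16) before
// handing them to the backend, instead of upcasting the input to float32.
const out = builder.linear(input, { alpha: 0.2, beta: 0.5 });
```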
Any concerns with adding this step to the algorithm?