Description
The Web ML Working Group charter says "The Model Loader API needs a standard format supported across browsers and devices for broad interoperability."
I'm creating this issue to ask: what should the standard format be?
For now, the prototype is using TensorFlow Lite. All models start off in either the TensorFlow SavedModel format or Keras format.
See the list of TF Lite operations that are supported.
The main reason TF Lite doesn't just use SavedModel (a protocol buffer format) directly is file size: the conversion to a flatbuffer applies some compression, and the conversion process also supports post-training quantization.
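As a concrete illustration, the SavedModel-to-TFLite conversion (including post-training quantization) looks roughly like this with TensorFlow's documented Python converter API. This is a sketch: the `convert_to_tflite` helper name and the lazy import are my own, not part of any spec.

```python
def convert_to_tflite(saved_model_dir: str) -> bytes:
    """Convert a SavedModel directory to a (smaller) TFLite flatbuffer.

    Hypothetical helper wrapping TensorFlow's TFLiteConverter.
    """
    import tensorflow as tf  # imported lazily so the sketch stands alone

    converter = tf.lite.TFLiteConverter.from_saved_model(saved_model_dir)
    # Enable the default optimizations, which include post-training
    # quantization -- one of the size reductions mentioned above.
    converter.optimizations = [tf.lite.Optimize.DEFAULT]
    return converter.convert()  # serialized flatbuffer bytes
```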
Some more thoughts:
- Operator definitions: The community group has spent a lot of time on the operator definitions for the WebNN spec. We should keep those for Model Loader.
- Serialization format: Apple's CoreML, TensorFlow's SavedModel, and Microsoft's ONNX all use protocol buffers as the serialization format. TensorFlow Lite converts the SavedModel to a flatbuffer. I'm not sure whether Apple or Microsoft also have compression or conversion steps, or whether they use the protobuf directly.
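One practical consequence of the flatbuffer choice: the on-disk format is cheap to identify. A TFLite flatbuffer carries the file identifier `TFL3` at byte offset 4 (the standard FlatBuffers `file_identifier` position), so a loader could sniff the format without parsing anything. A minimal sketch, assuming we only need to distinguish TFLite from protobuf-based formats (which have no such magic):

```python
def looks_like_tflite(data: bytes) -> bool:
    """Heuristic format check: TFLite flatbuffers place the
    file identifier "TFL3" at bytes 4-7."""
    return len(data) >= 8 and data[4:8] == b"TFL3"
```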