Proposal: onnx.json format (a human-readable intermediate representation) #24922
dsisco11 started this conversation in Ideas / Feature Requests · Replies: 1 comment
-
This would probably be better placed on https://github.com/onnx/onnx, where the ONNX format is developed. ONNX already has a mechanism for storing tensor data in an external file, and given how protobuf works, the plain-text protobuf format is probably also loadable (or can at least be parsed as protobuf and converted in memory to the binary format ORT expects).
-
Currently, the most popular format for describing models is the HuggingFace diffusers JSON format.
This creates a number of obstacles for using ML models in languages other than Python.
Proposal: onnx.json format (a human-readable intermediate representation)
The "unpacked" ONNX format would be a JSON-based representation of the model architecture, similar to the HuggingFace diffusers format, but designed to be more universal and language-agnostic.
Note: by "unpacked" we mean that the model architecture is represented in a human-readable format, rather than the binary ONNX format, which is not easily readable or editable.
Also note: this is NOT a proposal to replace the binary ONNX protobuf format by any means, but rather to create an additional format that can be used as an intermediate (unoptimized) representation for sharing neural networks.
The format would be governed by an official JSON Schema (published by the ONNX team) to validate the structure and content of model files.
Having a JSON Schema would not only allow for easy validation of model files, but would also enable tools (such as IDEs) to automatically assist model authors when writing an onnx.json file.
In order to be human-readable and editable within text editors, the onnx.json format obviously cannot store gigabytes of weights directly in the JSON file. Therefore, the format should allow large binary data to be stored in external .safetensors files, while still allowing binary data to be embedded directly in the JSON (ideally supporting both base64 strings and arrays of hex-encoded bytes).
The SafeTensors format is a good candidate for storing weights: it is designed to be efficient and safe to load in various environments, it is used almost universally, and it is a very simple format to implement.
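To make the two storage paths concrete, here is a minimal Python sketch. The embedded field names ("data_b64", "data_hex") are hypothetical, invented for illustration, and the safetensors reader/writer is a bare-bones rendering of the published layout (an 8-byte little-endian header length, a JSON header mapping tensor names to dtype/shape/offsets, then the raw tensor bytes):

```python
import base64
import json
import struct

def decode_embedded(tensor: dict) -> bytes:
    """Decode an initializer embedded directly in the JSON.
    The field names "data_b64" and "data_hex" are hypothetical."""
    if "data_b64" in tensor:
        return base64.b64decode(tensor["data_b64"])
    if "data_hex" in tensor:
        return bytes(int(h, 16) for h in tensor["data_hex"])
    raise KeyError("tensor has no embedded data")

def write_safetensors(path: str, tensors: dict) -> None:
    """Minimal safetensors writer: 8-byte little-endian header length,
    JSON header, then the concatenated raw tensor bytes."""
    header, offset, blobs = {}, 0, []
    for name, (dtype, shape, data) in tensors.items():
        header[name] = {"dtype": dtype, "shape": shape,
                        "data_offsets": [offset, offset + len(data)]}
        offset += len(data)
        blobs.append(data)
    head = json.dumps(header).encode("utf-8")
    with open(path, "wb") as f:
        f.write(struct.pack("<Q", len(head)))
        f.write(head)
        for blob in blobs:
            f.write(blob)

def read_safetensors(path: str) -> dict:
    """Return a mapping of tensor name -> raw bytes."""
    with open(path, "rb") as f:
        (n,) = struct.unpack("<Q", f.read(8))
        header = json.loads(f.read(n))
        buf = f.read()
    return {name: buf[slice(*meta["data_offsets"])]
            for name, meta in header.items()}
```

A loader would first resolve any external .safetensors file, then fall back to decoding whatever initializers are embedded inline.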
In order to aid abstraction and reusability, the onnx.json format should also allow defining reusable nodes or layers, which can then be referenced in the main model export section. This would allow common operations to be defined in a reusable way.
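As a sketch of how such references might be expanded at load time (the "definitions" and "ref" keys are invented for illustration here; the actual schema would be decided by the ONNX team):

```python
import copy

def expand_graph(doc: dict) -> list:
    """Replace reference nodes in the main graph with copies of their
    reusable definitions, rebinding inputs/outputs at each call site.
    The "definitions"/"ref" field names are hypothetical."""
    defs = doc.get("definitions", {})
    expanded = []
    for node in doc["graph"]["nodes"]:
        if "ref" in node:
            template = copy.deepcopy(defs[node["ref"]])
            # The call site supplies concrete tensor names.
            template["inputs"] = node.get("inputs", template.get("inputs", []))
            template["outputs"] = node.get("outputs", template.get("outputs", []))
            expanded.append(template)
        else:
            expanded.append(node)
    return expanded
```

A definition used twice in the graph would be copied twice, each time with its own input and output bindings.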
Abstraction of model architectures
To reduce development time for this new format, the first version could just focus on translating the existing ONNX proto items into their JSON equivalents.
However, in the long run, the onnx.json format should also provide abstractions for some of the most common model architectures and patterns. The largest candidates for this abstraction would be the UNet and Transformer architectures.
It should be noted that, since ML technology is evolving rapidly, the onnx.json format should aim to be extensible and allow new model architectures and patterns to be added in the future. This also means that providing abstractions for all the logic within a model's architecture is not feasible, as new optimization strategies are constantly emerging.
The problem can be mitigated if developers are able to mutate the model graph after it is loaded but before it is compiled into the final ONNX proto format.
This would allow end applications to do things like implement novel optimizations, such as different attention mechanisms, or even inject custom layers that are not part of the original model architecture.
With this approach, the ONNX team only needs to provide a single standard implementation of any given abstraction layer, and then the end application can override it and implement custom logic for that layer if needed.
This way, the onnx.json format can remain flexible and adaptable to future changes in the ML landscape!
Mockup of an onnx.json file
Here is a mockup of what an onnx.json file could look like.
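(The mockup itself was not captured in this page; purely as an illustrative, hypothetical sketch consistent with the ideas above — every field name here is invented, not part of any spec:)

```json
{
  "format": "onnx.json/v1",
  "metadata": { "name": "tiny-mlp", "opset": 21 },
  "weights": { "external": "tiny-mlp.safetensors" },
  "definitions": {
    "linear": {
      "op": "Gemm",
      "inputs": ["x", "weight", "bias"],
      "outputs": ["y"]
    }
  },
  "graph": {
    "inputs": [{ "name": "x", "dtype": "float32", "shape": [1, 16] }],
    "outputs": [{ "name": "out", "dtype": "float32", "shape": [1, 4] }],
    "initializers": {
      "scale": { "dtype": "float32", "shape": [1], "data_b64": "AACAPw==" }
    },
    "nodes": [
      { "ref": "linear", "inputs": ["x", "fc1.weight", "fc1.bias"], "outputs": ["h"] },
      { "op": "Relu", "inputs": ["h"], "outputs": ["out"] }
    ]
  }
}
```

Large weights live in the external .safetensors file, while the small "scale" initializer is embedded inline as base64.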