Proposal: onnx.json format (a human-readable intermediate representation) #24922
dsisco11 started this conversation in Ideas / Feature Requests · Replies: 1 comment
-
This would probably be better placed on https://github.com/onnx/onnx, where the ONNX format is developed. ONNX already has a mechanism for storing tensor data in an external file, and given how protobuf works, the plain-text protobuf format is probably also loadable (or can at least be parsed as protobuf and converted in memory to the binary format ORT expects).
-
Currently, the most popular format for describing models is the HuggingFace diffusers JSON format.
This creates a number of obstacles for using ML models in languages other than Python.
Proposal: onnx.json format (a human-readable intermediate representation)
The "unpacked" ONNX format would be a JSON-based representation of the model architecture, similar to the HuggingFace diffusers format, but designed to be more universal and language-agnostic.
Note: by "unpacked" we mean that the model architecture is represented in a human-readable format, rather than the binary ONNX format, which is not easily readable or editable.
Also note: this is NOT a proposal to replace the binary ONNX protobuf format by any means, but rather to create an additional format that can be used as an intermediate (unoptimized) representation for sharing neural networks.
The format would be governed by an official JSON Schema (published by the ONNX team) to validate the structure and content of model files.
Having a JSON Schema would not only allow for easy validation of model files, but would also enable tools (such as IDEs) to automatically assist model authors when writing an onnx.json file.
In order to be human-readable and editable within text editors, the onnx.json format obviously cannot store gigabytes of weights directly in the JSON file. Therefore, the format should allow large binary data to be stored in external .safetensors files, while still allowing binary data to be embedded directly in the JSON (ideally supporting both base64 strings and arrays of hex-encoded bytes).
The SafeTensors format is a good candidate for storing weights: it is designed to be efficient and safe to load in various environments, it is used almost universally, and it is a very simple format to implement.
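To make the two storage paths concrete, here is a minimal Python sketch. The embedded field names ("data_b64", "data_hex") are hypothetical, invented for illustration, and the safetensors reader/writer is a bare-bones rendering of the published layout (an 8-byte little-endian header length, a JSON header mapping tensor names to dtype/shape/offsets, then the raw tensor bytes):

```python
import base64
import json
import struct

def decode_embedded(tensor: dict) -> bytes:
    """Decode an initializer embedded directly in the JSON.
    The field names "data_b64" and "data_hex" are hypothetical."""
    if "data_b64" in tensor:
        return base64.b64decode(tensor["data_b64"])
    if "data_hex" in tensor:
        return bytes(int(h, 16) for h in tensor["data_hex"])
    raise KeyError("tensor has no embedded data")

def write_safetensors(path: str, tensors: dict) -> None:
    """Minimal safetensors writer: 8-byte little-endian header length,
    JSON header, then the concatenated raw tensor bytes."""
    header, offset, blobs = {}, 0, []
    for name, (dtype, shape, data) in tensors.items():
        header[name] = {"dtype": dtype, "shape": shape,
                        "data_offsets": [offset, offset + len(data)]}
        offset += len(data)
        blobs.append(data)
    head = json.dumps(header).encode("utf-8")
    with open(path, "wb") as f:
        f.write(struct.pack("<Q", len(head)))
        f.write(head)
        for blob in blobs:
            f.write(blob)

def read_safetensors(path: str) -> dict:
    """Return a mapping of tensor name -> raw bytes."""
    with open(path, "rb") as f:
        (n,) = struct.unpack("<Q", f.read(8))
        header = json.loads(f.read(n))
        buf = f.read()
    return {name: buf[slice(*meta["data_offsets"])]
            for name, meta in header.items()}
```

A loader would first resolve any external .safetensors file, then fall back to decoding whatever initializers are embedded inline.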
In order to aid abstraction and reusability, the onnx.json format should also allow defining reusable nodes or layers, which can then be referenced in the main model export section. This would allow common operations to be defined in a reusable way.
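As a sketch of how such references might be expanded at load time (the "definitions" and "ref" keys are invented for illustration here; the actual schema would be decided by the ONNX team):

```python
import copy

def expand_graph(doc: dict) -> list:
    """Replace reference nodes in the main graph with copies of their
    reusable definitions, rebinding inputs/outputs at each call site.
    The "definitions"/"ref" field names are hypothetical."""
    defs = doc.get("definitions", {})
    expanded = []
    for node in doc["graph"]["nodes"]:
        if "ref" in node:
            template = copy.deepcopy(defs[node["ref"]])
            # The call site supplies concrete tensor names.
            template["inputs"] = node.get("inputs", template.get("inputs", []))
            template["outputs"] = node.get("outputs", template.get("outputs", []))
            expanded.append(template)
        else:
            expanded.append(node)
    return expanded
```

A definition used twice in the graph would be copied twice, each time with its own input and output bindings.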
Abstraction of model architectures
To reduce development time for this new format, the first version could just focus on translating the existing ONNX proto items into their JSON equivalents.
However, in the long run, the onnx.json format should also provide abstractions for some of the most common model architectures and patterns. The largest candidates for this abstraction would be the UNet and Transformer architectures.
It should be noted that, since ML technology is evolving rapidly, the onnx.json format should aim to be extensible and allow new model architectures and patterns to be added in the future. This also means that providing abstractions for all the logic within a model's architecture is not feasible, as new optimization strategies are constantly emerging.
The problem can be mitigated if developers are able to mutate the model graph after it is loaded but before it is compiled into the final ONNX proto format.
This would allow end applications to do things like implement novel optimizations, such as different attention mechanisms, or even inject custom layers that are not part of the original model architecture.
With this approach, the ONNX team only needs to provide a single standard implementation of any given abstraction layer, and then the end application can override it and implement custom logic for that layer if needed.
This way, the onnx.json format can remain flexible and adaptable to future changes in the ML landscape!
Mockup of an onnx.json file
Here is a mockup of what an onnx.json file could look like.
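(The mockup itself was not captured in this page; purely as an illustrative, hypothetical sketch consistent with the ideas above — every field name here is invented, not part of any spec:)

```json
{
  "format": "onnx.json/v1",
  "metadata": { "name": "tiny-mlp", "opset": 21 },
  "weights": { "external": "tiny-mlp.safetensors" },
  "definitions": {
    "linear": {
      "op": "Gemm",
      "inputs": ["x", "weight", "bias"],
      "outputs": ["y"]
    }
  },
  "graph": {
    "inputs": [{ "name": "x", "dtype": "float32", "shape": [1, 16] }],
    "outputs": [{ "name": "out", "dtype": "float32", "shape": [1, 4] }],
    "initializers": {
      "scale": { "dtype": "float32", "shape": [1], "data_b64": "AACAPw==" }
    },
    "nodes": [
      { "ref": "linear", "inputs": ["x", "fc1.weight", "fc1.bias"], "outputs": ["h"] },
      { "op": "Relu", "inputs": ["h"], "outputs": ["out"] }
    ]
  }
}
```

Large weights live in the external .safetensors file, while the small "scale" initializer is embedded inline as base64.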