Compile API: support for OrtModel input and write output to stream #24740
base: main
Conversation
file_offset,
tensor_byte_size,
gsl::make_span(reinterpret_cast<char*>(unpacked_tensor.data()), tensor_byte_size)));
}
Note: fixes incomplete handling of external data that is stored in memory with the tag kTensorProtoMemoryAddressTag. I ran into this when serializing an OrtModel (created via the editor API) to a model proto. There's a test in this PR that triggers this scenario.
This is re-worked in my OrtValue initializers PR.
There's also an external PR to address. What's the preferred approach? Get something in now and replace with the re-worked OrtValue initializers PR?
ORT_API2_STATUS(ModelCompilationOptions_SetInputModel, _In_ OrtModelCompilationOptions* model_compile_options,
                _In_ const OrtModel* input_model);
Does this work with both variants of OrtModel? One completely created with the model editor API, and one that augments an existing model? I think it's fine to only support the former.
typedef OrtStatus*(ORT_API_CALL* OrtOutStreamWriteFunc)(_In_ void* stream_state,
                                                        _In_ const void* buffer,
                                                        _In_ size_t buffer_num_bytes,
                                                        _Out_ size_t* num_bytes_written);
Do we need num_bytes_written? Doesn't that complicate things vs. requiring the function implementer to handle all bytes they were given?
Good call. Removed.
model_saving_options);
size_t buffer_size = model_proto.ByteSizeLong();
ORT_RETURN_IF(buffer_size > static_cast<size_t>(std::numeric_limits<int>::max()),
              "Cannot serialize ONNX ModelProto larger than 2GB");
Is this going to be a problem for the WebNN use case?
I'm not sure. As far as I'm aware, ONNX (protobuf) does not support files larger than 2 GB, so weights would have to be external for large models. Generating external weight files would seem to be an issue for the web, though.
size_t input_model_data_size_ = 0;

std::variant<std::monostate, // Initial state (no input model)
             std::string,    // input model path
Does this need to support wide chars? IIRC the paths in the C API do, and using std::filesystem is a simple way to do that.
Description
Adds support for using an OrtModel instance that was created with the model editor API as the input model, and for writing the compiled output to a stream.

Examples
More examples can be found in the unit tests.

C++
Compiles an OrtModel to a stream that itself writes to a file.

Motivation and Context
Provide flexibility to users of the Compile API, like those compiling for WebNN, who want to open/write files outside of ORT.

Next PR: allow loading the input model from an existing file handle.