forked from chromium/chromium
Support dumping model cache for OV EP #137
Open
shiyi9801 wants to merge 60 commits into ort_backend from enable_model_cache
Conversation
…wer preference and device type
* Pass ORT_API_VERSION to `OrtApiBase::GetApi()`. Also removes the inclusion of the onnx.pb.h header.
* Add third_party/onnxruntime_headers. Imports https://github.com/microsoft/onnxruntime/tree/main/include. Commit is based on microsoft/onnxruntime#23223.
* Use ORT Model Builder API.
* Refactor the scoped ORT type pointer:
  1. Rename to `ScopedOrtTypePtr`
  2. Use macros
  3. Introduce `operator T*()`
  4. Introduce a `Release()` method
  5. Rename `get_ptr()` to `Get()`
  6. Rename `get_pptr()` to `GetAddressOf()`
* Remove ONNX Runtime headers from third_party/microsoft_dxheaders.
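For readers unfamiliar with the pattern, here is a minimal sketch of what such a macro-generated scoped ORT type pointer could look like. The `GetOrtApi()` helper and the exact `Release()` semantics (relinquish ownership without freeing, mirroring `std::unique_ptr::release()`) are assumptions, not the PR's verbatim code:

```
// Minimal sketch of a macro-generated scoped pointer for ORT C API types.
// Assumes C++17; GetOrtApi() is a hypothetical accessor for the OrtApi table.
#include "onnxruntime_c_api.h"

inline const OrtApi* GetOrtApi() {
  return OrtGetApiBase()->GetApi(ORT_API_VERSION);
}

#define DEFINE_SCOPED_ORT_TYPE_PTR(T)                                  \
  class ScopedOrt##T##Ptr {                                            \
   public:                                                             \
    ScopedOrt##T##Ptr() = default;                                     \
    ~ScopedOrt##T##Ptr() { Reset(); }                                  \
    ScopedOrt##T##Ptr(const ScopedOrt##T##Ptr&) = delete;              \
    ScopedOrt##T##Ptr& operator=(const ScopedOrt##T##Ptr&) = delete;   \
    operator Ort##T*() const { return ptr_; }                          \
    Ort##T* Get() const { return ptr_; }                               \
    /* For out-parameters filled in by ORT C API calls. */             \
    Ort##T** GetAddressOf() { return &ptr_; }                          \
    /* Relinquish ownership without freeing (assumed semantics). */    \
    Ort##T* Release() {                                                \
      Ort##T* p = ptr_;                                                \
      ptr_ = nullptr;                                                  \
      return p;                                                        \
    }                                                                  \
    void Reset() {                                                     \
      if (ptr_) {                                                      \
        GetOrtApi()->Release##T(ptr_);                                 \
        ptr_ = nullptr;                                                \
      }                                                                \
    }                                                                  \
   private:                                                            \
    Ort##T* ptr_ = nullptr;                                            \
  };

// One macro invocation per wrapped type, e.g.:
DEFINE_SCOPED_ORT_TYPE_PTR(SessionOptions)  // -> ScopedOrtSessionOptionsPtr
DEFINE_SCOPED_ORT_TYPE_PTR(OpAttr)          // -> ScopedOrtOpAttrPtr
```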
* Introduce the webnn_use_ort build flag and enable it for Windows.
* Introduce the --webnn-use-ort switch. When enabled, it overrides the DirectML backend.
* Remove the non-working DML EP code path for GPU and NPU. All context options use the CPU EP for now; the OpenVINO EP will be used for GPU and NPU devices.
* Allow loading onnxruntime.dll from the system folder.
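As a rough illustration (not the PR's actual code), checking such a switch in Chromium typically looks like this; the switch constant's name and location are assumptions:

```
// Sketch: consult the --webnn-use-ort switch; when present, the ORT
// backend overrides DirectML. The constant's name/location are illustrative.
#include "base/command_line.h"

namespace switches {
inline constexpr char kWebNNUseOrt[] = "webnn-use-ort";
}  // namespace switches

bool ShouldUseOrtBackend() {
  return base::CommandLine::ForCurrentProcess()->HasSwitch(
      switches::kWebNNUseOrt);
}
```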
1. Add reduce and instance_norm ops.
2. Refactor some code, including: rename `uint64_t NewInitializerAsRawData` to `std::string CreateInitializerAsRawData`; remove the unused `ORT_ABORT_ON_ERROR`.
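The rename's main effect is the contract change: callers now reference initializers by generated name rather than by numeric id. A toy sketch of that contract (illustrative only, not the PR's code):

```
// Toy model of the contract change: the helper returns the generated
// initializer name (std::string) instead of a numeric id (uint64_t).
#include <cstdint>
#include <map>
#include <string>
#include <utility>
#include <vector>

class InitializerStore {
 public:
  std::string CreateInitializerAsRawData(std::vector<uint8_t> raw_data) {
    std::string name = "initializer_" + std::to_string(next_id_++);
    initializers_[name] = std::move(raw_data);
    return name;  // callers now wire nodes up by name
  }

 private:
  uint64_t next_id_ = 0;
  std::map<std::string, std::vector<uint8_t>> initializers_;
};
```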
Update ORT headers to the latest Model Builder API: microsoft/onnxruntime@4e2d061. According to the latest API, the node owns its attributes, so this PR releases attributes after calling `AddNode()`. It also changes `CreateAttribute()` to return a `ScopedOrtOpAttrPtr` to simplify the code.
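The ownership hand-off might look roughly like this. `CreateOpAttr` and `ReleaseStatus` are real ORT C API entry points; `OrtModelBuilder::AddNode` and `ScopedOrtOpAttrPtr` are the PR's own helpers, and their signatures are assumed here:

```
// Sketch of the attribute ownership pattern: create the attribute, hand it
// to AddNode() (which now owns it per the updated Model Builder API), then
// drop the scoped wrapper's claim without freeing.
void AddSoftmaxNode(const OrtApi* ort, OrtModelBuilder& builder) {
  int64_t axis = 1;
  ScopedOrtOpAttrPtr attr;
  // len is 1 for a scalar attribute (see the C API header for exact semantics).
  OrtStatus* status = ort->CreateOpAttr(
      "axis", &axis, /*len=*/1, ORT_OP_ATTR_INT, attr.GetAddressOf());
  if (status) {
    ort->ReleaseStatus(status);  // real code would propagate the error
    return;
  }
  // Assumed wrapper signature: op type, inputs, outputs, attributes.
  builder.AddNode("Softmax", /*inputs=*/{"x"}, /*outputs=*/{"y"},
                  /*attributes=*/{attr.Get()});
  attr.Release();  // the node owns the attribute now; don't double-free
}
```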
When the `--webnn-ort-use-openvino` switch is used, the OpenVINO EP is used for all WebNN contexts, and the WebNN device type maps to the OpenVINO EP device type. With this change, developers (like me) can test the OpenVINO EP on CPU (my dev machine doesn't have an Intel GPU or NPU). Usage:
1. Build the OpenVINO EP by following https://onnxruntime.ai/docs/build/eps.html#openvino. **Note**: Please use OpenVINO version >= 2024.4 (tested on 2024.6).
2. Copy the following DLLs into the Chromium build folder or version folder:
```
onnxruntime.dll
onnxruntime_providers_shared.dll
onnxruntime_providers_openvino.dll
```
3. Ensure the OpenVINO environment variables are set, i.e.
```
"C:\Program Files (x86)\Intel\openvino_2024\setupvars.bat"
```
4. Append --no-sandbox to load the necessary DLLs into the GPU process, i.e.
```
chrome.exe --webnn-use-ort --use-redist-ort --webnn-ort-use-openvino --no-sandbox
```
1. Remove the unnecessary `OperandDataType data_type` parameter of the `CreateInitializer` method and map the data type to the ONNX tensor type instead.
2. Add a helper method `CreateScalarInitializer` to create a scalar with an empty shape.
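A scalar initializer is then just the raw-data path with an empty shape. A hedged sketch, reusing the renamed helper from above (builder type and signatures assumed):

```
// Sketch: a rank-0 (scalar) ONNX tensor is an initializer with empty dims.
// CreateInitializerAsRawData and the builder type are assumed helpers.
std::string CreateScalarInitializer(OrtModelBuilder& builder, float value) {
  return builder.CreateInitializerAsRawData(
      /*shape=*/{},  // empty dims => rank-0 scalar
      ONNX_TENSOR_ELEMENT_DATA_TYPE_FLOAT,
      base::as_bytes(base::span_from_ref(value)));
}
```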
This PR extracts the environment, allocator, and memory info out of `AllocatorOrt` and eliminates the need for that class at the current stage.

The environment must be initialized before any other ORT API calls (e.g. using the logger) and must be released only after all sessions are released (otherwise #75). The environment is reference counted: the first `CreateEnv()` creates the instance, subsequent `CreateEnv()` calls increase its reference count and return a reference to the same instance, and when the last reference is removed the environment instance is released. This PR stores a reference to `OrtEnv` inside `GraphImplOrt::Session` ahead of the `OrtSession` to ensure the release order.

At the current stage we only use the CPU allocator, so we can just get the pointer to the default CPU allocator, and the memory info can simply be CPU memory info. It is unclear whether and how we would need a custom allocator for a particular device. #65

Other changes include:
1. Introduce `TensorImplOrt::Create()`, which allows reporting an error for any ORT API failure rather than crashing.
2. Similarly, allow `GraphImplOrt::CreateAndBuild()` to report an error for any ORT API failure.
3. Use scoped ORT types for `BufferContentOrt`, `OrtEnv`, `OrtSession`, `OrtSessionOptions` and `OrtMemoryInfo`.

Fix #75
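A minimal sketch of that reference-counted lifecycle, with illustrative names and thread-safety omitted; `CreateEnv`/`ReleaseEnv` are real ORT C API calls:

```
// First AcquireOrtEnv() creates the OrtEnv; later calls add a reference;
// the env is destroyed only when the last reference is dropped, which must
// happen after every OrtSession is gone (see #75).
#include "onnxruntime_c_api.h"

namespace {
OrtEnv* g_env = nullptr;
int g_env_ref_count = 0;  // a real implementation needs synchronization
}  // namespace

OrtEnv* AcquireOrtEnv(const OrtApi* ort) {
  if (g_env_ref_count++ == 0) {
    OrtStatus* status =
        ort->CreateEnv(ORT_LOGGING_LEVEL_WARNING, "WebNN", &g_env);
    if (status) {
      ort->ReleaseStatus(status);  // real code surfaces the error
      g_env = nullptr;
    }
  }
  return g_env;
}

void ReleaseOrtEnv(const OrtApi* ort) {
  if (--g_env_ref_count == 0 && g_env) {
    ort->ReleaseEnv(g_env);
    g_env = nullptr;
  }
}
```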
Fix #87
1. Refactor the code for inserting cast nodes.
2. Support logical not and fix bugs in all logical operators.
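For context, the cast insertion exists because ONNX logical operators consume and produce bool tensors while WebNN represents logical values as uint8, so casts are inserted on either side. A sketch with assumed builder helpers:

```
// Sketch: wrap ONNX "Not" (bool in/out) for WebNN's uint8 logical tensors.
// AddCastNode/AddNode are assumed helpers, not verified API.
void AddLogicalNot(OrtModelBuilder& builder, const std::string& input,
                   const std::string& output) {
  const std::string bool_input = input + "_bool";    // hypothetical naming
  const std::string bool_output = output + "_bool";  // hypothetical naming
  builder.AddCastNode(input, bool_input, ONNX_TENSOR_ELEMENT_DATA_TYPE_BOOL);
  builder.AddNode("Not", {bool_input}, {bool_output}, /*attributes=*/{});
  builder.AddCastNode(bool_output, output,
                      ONNX_TENSOR_ELEMENT_DATA_TYPE_UINT8);
}
```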
Fix #63 This PR renames the operands to make sure each name is unique and refactors `ComputeResources` to use the new operand names.
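One simple way to guarantee uniqueness is a per-graph counter suffix; a toy sketch (the PR's actual naming scheme may differ):

```
#include <cstdint>
#include <string>

// Toy sketch: append a monotonically increasing suffix per graph so no two
// operands share a name.
class OperandNamer {
 public:
  std::string GetUniqueName(const std::string& label) {
    return label + "_" + std::to_string(next_id_++);
  }

 private:
  uint64_t next_id_ = 0;
};
```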
This PR refactors and simplifies the error-handling code:
1. Define `ScopedOrtStatusPtr`, which is responsible for releasing the `OrtStatus*`.
2. Add some macro definitions; for example, `CALL_ORT_FUNC` converts the original `OrtStatus*` return value into a `ScopedOrtStatusPtr`.
3. Let some methods return `ScopedOrtStatusPtr`, for example `ScopedOrtStatusPtr OrtModelBuilder::AddInitializer`.
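The shape of that pattern, sketched with assumed details (reusing the `GetOrtApi()` helper from the earlier sketch; `ReleaseStatus` and `GetErrorMessage` are real ORT C API calls):

```
// Sketch: a scoped owner for OrtStatus* plus a macro that wraps raw
// OrtStatus*-returning C API calls into it. Names follow the commit
// message; bodies are assumptions.
class ScopedOrtStatusPtr {
 public:
  ScopedOrtStatusPtr(OrtStatus* status = nullptr) : status_(status) {}
  ~ScopedOrtStatusPtr() {
    if (status_) {
      GetOrtApi()->ReleaseStatus(status_);
    }
  }
  ScopedOrtStatusPtr(const ScopedOrtStatusPtr&) = delete;
  ScopedOrtStatusPtr& operator=(const ScopedOrtStatusPtr&) = delete;

  bool IsError() const { return status_ != nullptr; }
  const char* ErrorMessage() const {
    return status_ ? GetOrtApi()->GetErrorMessage(status_) : "";
  }

 private:
  OrtStatus* status_;
};

// Converts a raw OrtStatus* return into the scoped type at the call site.
#define CALL_ORT_FUNC(expr) ScopedOrtStatusPtr(expr)
```

Usage might look like `ScopedOrtStatusPtr status = CALL_ORT_FUNC(ort->CreateSessionOptions(options.GetAddressOf()));` followed by an `IsError()` check.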
* Replace base::ranges with std::ranges.
* Add RankRange for some ops.
* softmax requires an axis, so the input can't be a scalar.
* split requires an axis, so the input can't be a scalar.
* triangular only supports input rank >= 2.
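These constraints could be enforced by a small rank check; an illustrative sketch (function name and shape are assumptions):

```
#include <string_view>

// Sketch of the rank constraints above: softmax and split need an axis, so
// a scalar (rank 0) input is invalid; triangular needs at least a matrix.
bool IsInputRankSupported(std::string_view op, int rank) {
  if (op == "softmax" || op == "split") {
    return rank >= 1;
  }
  if (op == "triangular") {
    return rank >= 2;
  }
  return true;
}
```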
@huningxin PTAL, thanks!
huningxin reviewed Feb 13, 2025
```
std::string cache_dir;
if (dump_directory.has_value()) {
  cache_dir = base::SysWideToUTF8(dump_directory->value());
  openvino_options.cache_dir = cache_dir.c_str();
```
I suppose we should have a separate switch to enable OV model caching. I don't think it is equivalent to `SetOptimizedModelFilePath` of ORT, which is used for ONNX model inspection; OV model caching is intended to reduce graph compilation time.
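For readers, the two mechanisms being contrasted might be configured like this; `SetOptimizedModelFilePath`, `OrtOpenVINOProviderOptions`, and `SessionOptionsAppendExecutionProvider_OpenVINO` are real ORT C API surface, while the surrounding wiring is illustrative:

```
#include "onnxruntime_c_api.h"

// Sketch contrasting the two mechanisms (status returns elided for brevity):
void ConfigureCaching(const OrtApi* ort, OrtSessionOptions* options) {
  // 1) ORT optimized-model dump: writes the optimized ONNX model to disk,
  //    mainly useful for inspecting what ORT produced.
  ort->SetOptimizedModelFilePath(options, ORT_TSTR("optimized_model.onnx"));

  // 2) OpenVINO EP model cache: caches compiled blobs in cache_dir to cut
  //    graph compilation time on subsequent runs.
  OrtOpenVINOProviderOptions ov_options = {};
  ov_options.cache_dir = "ov_cache";
  ort->SessionOptionsAppendExecutionProvider_OpenVINO(options, &ov_options);
}
```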
Force-pushed from 79e0772 to d984069
Fix #72