-
Notifications
You must be signed in to change notification settings - Fork 39
Optimize CPU time spent in inference path #682
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: ovep-develop
Are you sure you want to change the base?
Conversation
Move input/output name to ort/ov input output bindings to compilation. Reduce tensor lookups by name in favor of index look ups.
@@ -121,15 +121,15 @@ std::istream& operator>>(std::istream& stream, SharedContext::SharedWeights::Met | |||
namespace backend_utils { | |||
|
|||
bool IsDebugEnabled() { | |||
const std::string env_name = onnxruntime::GetEnvironmentVar("ORT_OPENVINO_ENABLE_DEBUG"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Suggestion, can we pull these into the context or subcontext instead of checking all over the place? It's fine if we do this change for not just wondering.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I don't see a reason why it can't be moved. Though I'll defer that for now since making it static effectively removes the environment var checks anyway.
ORT_ENFORCE(!input_name.empty(), log_tag, | ||
"Input names mismatch between OpenVINO and ONNX. ", onnx_input_name, | ||
" doesn't exist in the list of OpenVINO input tensor names"); | ||
bool cpu_or_gpu = (session_context_.device_type.find("CPU") != std::string::npos || |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since I'm spreading my musings here I might as well mention that I'd like to avoid doing these string compare all over the place and have a test for selected devices that can be easily and quickly be tested for.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Absolutely. I was tempted to do that as well, but thinking about the meta devices (auto, multi, hetero, etc) complicated just enough that I didn't want to go down that rabbit hole (yet). 😄
@sfatimar would you mind helping this PR along? I have a follow up I want to make on top of this to benefit other devices. |
@MayureshV1 please follow up. |
Spoke to Eric and reviewed the changes. These look good to me. Unknowns: Impact on abstract devices like AUTO. This is not an artifact of Eric's changes but we might have some abstract devices that might not get the full benefit because of device checks in basic_backend. @preetha-intel , need to look at that aspect. Eric has spot tested but we need to validate for functionality and perf impact across below flows: |
Move input/output name to ort/ov input output bindings to compilation.
Reduce tensor lookups by name in favor of index look ups.