
Optimize CPU time spent in inference path #682


Open · wants to merge 2 commits into ovep-develop

Conversation

@ericcraw commented May 1, 2025

Move input/output name resolution for the ORT/OV input-output bindings to compile time.
Reduce tensor lookups by name in favor of index lookups.
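The idea behind the change can be sketched roughly as follows. This is a hypothetical illustration, not the PR's actual code: the `TensorBinding` struct and `BindInputs` helper are invented names, standing in for resolving each ONNX input name to its OpenVINO tensor index once at compile time, so the per-inference path only does vector indexing instead of repeated map lookups by name.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical binding of an ONNX-side input name to the index of the
// matching OpenVINO tensor, resolved once at compilation time.
struct TensorBinding {
  std::string onnx_name;
  size_t ov_index;
};

// Build the bindings once (initialization time); the inference loop then
// uses bindings[i].ov_index directly instead of a name lookup per run.
std::vector<TensorBinding> BindInputs(
    const std::vector<std::string>& onnx_names,
    const std::unordered_map<std::string, size_t>& ov_name_to_index) {
  std::vector<TensorBinding> bindings;
  bindings.reserve(onnx_names.size());
  for (const auto& name : onnx_names) {
    auto it = ov_name_to_index.find(name);
    assert(it != ov_name_to_index.end() &&
           "input name mismatch between ORT and OpenVINO");
    bindings.push_back({name, it->second});
  }
  return bindings;
}
```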

@ericcraw ericcraw changed the base branch from master to ovep-develop May 1, 2025 18:23
@@ -121,15 +121,15 @@ std::istream& operator>>(std::istream& stream, SharedContext::SharedWeights::Met
namespace backend_utils {

bool IsDebugEnabled() {
const std::string env_name = onnxruntime::GetEnvironmentVar("ORT_OPENVINO_ENABLE_DEBUG");


Suggestion: can we pull these into the context or subcontext instead of checking all over the place? It's fine if we don't do it in this change, just wondering.

@ericcraw (Author)

Yeah, I don't see a reason why it can't be moved. Though I'll defer that for now since making it static effectively removes the environment var checks anyway.
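The "making it static" point can be sketched like this. A minimal illustration only, assuming a plain `std::getenv` read rather than onnxruntime's `GetEnvironmentVar` wrapper: a function-local static caches the result of the environment check, so the variable is read once per process instead of on every call.

```cpp
#include <cassert>
#include <cstdlib>
#include <string>

// Hypothetical sketch: cache the debug flag in a function-local static so
// the environment variable is consulted only on the first call.
bool IsDebugEnabled() {
  static const bool enabled = [] {
    const char* v = std::getenv("ORT_OPENVINO_ENABLE_DEBUG");
    return v != nullptr && std::string(v) == "1";
  }();
  return enabled;  // subsequent calls return the cached value
}
```

Since C++11, initialization of the static is thread-safe, so concurrent first calls are fine; the trade-off is that changing the variable after the first call has no effect.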

ORT_ENFORCE(!input_name.empty(), log_tag,
"Input names mismatch between OpenVINO and ONNX. ", onnx_input_name,
" doesn't exist in the list of OpenVINO input tensor names");
bool cpu_or_gpu = (session_context_.device_type.find("CPU") != std::string::npos ||


Since I'm spreading my musings here, I might as well mention that I'd like to avoid doing these string compares all over the place and instead have a predicate for the selected devices that can be tested easily and quickly.

@ericcraw (Author)

Absolutely. I was tempted to do that as well, but thinking about the meta devices (auto, multi, hetero, etc) complicated just enough that I didn't want to go down that rabbit hole (yet). 😄
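One way the "quick device test" could look, sketched here as an assumption rather than anything in the PR: parse the device string once at session creation into a bitmask, so the hot path tests a flag instead of doing `std::string::find` on every check. The flag names and the treatment of meta devices (AUTO, MULTI, HETERO, which can list several targets, e.g. "AUTO:GPU,CPU") are hypothetical, which is exactly the complication the comment above alludes to.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical device flags; a meta device string sets a bit for every
// target it mentions.
enum DeviceFlags : uint32_t {
  kDeviceCPU = 1u << 0,
  kDeviceGPU = 1u << 1,
  kDeviceNPU = 1u << 2,
};

// Parse once at session setup; the inference path then checks bits.
uint32_t ParseDeviceFlags(const std::string& device_type) {
  uint32_t flags = 0;
  if (device_type.find("CPU") != std::string::npos) flags |= kDeviceCPU;
  if (device_type.find("GPU") != std::string::npos) flags |= kDeviceGPU;
  if (device_type.find("NPU") != std::string::npos) flags |= kDeviceNPU;
  return flags;
}
```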

@ericcraw (Author)

@sfatimar would you mind helping this PR along? I have a follow-up I want to make on top of this to benefit other devices.

@sfatimar sfatimar requested a review from MayureshV1 May 13, 2025 07:23
@sfatimar

@MayureshV1 please follow up.

@MayureshV1

Spoke to Eric and reviewed the changes. These look good to me.

Unknowns: Impact on abstract devices like AUTO. This is not an artifact of Eric's changes but we might have some abstract devices that might not get the full benefit because of device checks in basic_backend. @preetha-intel , need to look at that aspect.

Eric has spot-tested, but we need to validate functionality and perf impact across the following flows:
inference with an ONNX model, with OV Cache_DIR, and with EPCtx.

@jatinwadhwa921

7 unit tests are failing; please fix them.

(screenshot of the failing unit tests attached)
