
open-telemetry / opentelemetry-python-contrib Public


Add OpenAI embeddings instrumentation #3461


Open

drewby wants to merge 16 commits into open-telemetry:main from drewby:openai-embeddings

+22,288 −45

Conversation 17 Commits 16 Checks 719 Files changed 24


Member

drewby commented May 4, 2025

Description

This PR adds instrumentation for OpenAI's embeddings API in the GenAI instrumentation suite. The implementation follows the OpenTelemetry semantic conventions for generative AI systems and provides automatic instrumentation for the OpenAI Python client when using embeddings functionality.

The implementation captures important metadata about embedding operations including model, dimensions, and relevant timing information while respecting sensitive data handling practices.

- Added instrumentation for both synchronous and asynchronous OpenAI embeddings API calls
- Implemented spans and metrics using existing attributes, plus two new custom attributes:
  - ai.embedding.dimensions - Number of dimensions in the embedding vectors
  - ai.embedding.encoding_format - The encoding format of the embedding vectors in the response (base64 or float)
- Added optional capture of input text content (disabled by default for privacy)
- Added a usage example called embeddings
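
For illustration only (this is not code from the PR): a minimal sketch of how the request parameters of an embeddings call could be mapped to the attributes listed above. The helper name `build_embeddings_request_attributes` is hypothetical; the two `ai.embedding.*` keys are the ones named in this description, and the `gen_ai.*` keys are assumed to follow the existing GenAI semantic conventions.

```python
from typing import Any, Dict, Mapping

# Existing GenAI semantic-convention attribute names (assumed here).
GEN_AI_OPERATION_NAME = "gen_ai.operation.name"
GEN_AI_SYSTEM = "gen_ai.system"
GEN_AI_REQUEST_MODEL = "gen_ai.request.model"
# Custom attribute names as proposed in the PR description.
AI_EMBEDDING_DIMENSIONS = "ai.embedding.dimensions"
AI_EMBEDDING_ENCODING_FORMAT = "ai.embedding.encoding_format"


def build_embeddings_request_attributes(kwargs: Mapping[str, Any]) -> Dict[str, Any]:
    """Hypothetical helper: map embeddings.create() kwargs to attributes."""
    attributes = {
        GEN_AI_OPERATION_NAME: "embeddings",
        GEN_AI_SYSTEM: "openai",
        GEN_AI_REQUEST_MODEL: kwargs.get("model"),
        AI_EMBEDDING_DIMENSIONS: kwargs.get("dimensions"),
        AI_EMBEDDING_ENCODING_FORMAT: kwargs.get("encoding_format"),
    }
    # Optional parameters the caller did not set are left out entirely.
    # The "input" text is deliberately not copied: content capture is opt-in.
    return {key: value for key, value in attributes.items() if value is not None}


request_attributes = build_embeddings_request_attributes(
    {"model": "text-embedding-3-small", "input": "hello", "dimensions": 256}
)
```

Because the dictionary is built from the call arguments rather than read back from a span, the same values can be reused for both span attributes and metric attributes.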

Type of change

New feature (non-breaking change which adds functionality)

How Has This Been Tested?

Unit tests using mock responses to verify proper span creation and attribute population
Integration tests with the OpenAI client against a mock server
Manual testing using examples/embeddings with real OpenAI service

Does This PR Require a Core Repo Change?

Yes. - Link to PR:
No.

Checklist:

See contributing.md for styleguide, changelog guidelines, and more.

Followed the style guidelines of this project
Changelogs have been updated
Unit tests have been added
Documentation has been updated


drewby added 5 commits

May 3, 2025 04:30

- Initial implementation and tests (ab95ef6)
- Add embeddings example (df2dd1f)
- Update documentation (a360307)
- Changelog entry (92ccc30)
- Add comment about custom attributes (3929fa9)

drewby requested a review from a team as a code owner

May 4, 2025 03:18

github-actions bot assigned alizenhom, codefromthecrypt, gyliu513, karthikscale3, lmolkova, lzchen and nirga

github-actions bot requested review from alizenhom, codefromthecrypt, gyliu513, karthikscale3, lmolkova, lzchen and nirga

May 4, 2025 03:18

drewby added 2 commits

May 4, 2025 03:19

- Update PR link in Changelog (ced260a)
- Add input and output events for embeddings (1966c7a)

lmolkova reviewed


Contributor

lmolkova left a comment


thank you!!!


...opentelemetry-instrumentation-openai-v2/src/opentelemetry/instrumentation/openai_v2/patch.py (outdated, resolved)

...opentelemetry-instrumentation-openai-v2/src/opentelemetry/instrumentation/openai_v2/patch.py (outdated, resolved)

...opentelemetry-instrumentation-openai-v2/src/opentelemetry/instrumentation/openai_v2/patch.py

+                      # Emit input event
+                      input_event_attributes = {
+                          GenAIAttributes.GEN_AI_SYSTEM: GenAIAttributes.GenAiSystemValues.OPENAI.value,
+                          EventAttributes.EVENT_NAME: "gen_ai.embeddings.input",


Contributor

lmolkova May 4, 2025


I'd postpone defining any new events until we have clarity on open-telemetry/semantic-conventions#2010

Also why not gen_ai.user.message?


Member Author

drewby May 6, 2025


Recording inputs/outputs is a fundamental feature we discussed, so I think we need to record them as something. The current implementation for chat completions uses events, so that would be the most consistent approach until we decide on 2010. If the decision is to move to attributes, we are going to need a breaking-change PR anyway.

I could reuse the existing event name, but it's not really a user message in the same way as text completions.
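
For illustration only (not code from this PR): a minimal sketch of the trade-off discussed above, where the event name stays a parameter so that switching between `gen_ai.embeddings.input` and `gen_ai.user.message` is a one-line change, and the input text is attached only when content capture is enabled. The environment variable name is assumed to be the one the existing openai-v2 instrumentation uses for chat completions; `build_input_event` and the plain-dict event shape are hypothetical.

```python
import os
from typing import Any, Dict, Mapping, Optional

# Assumed opt-in switch, matching the existing chat completions behaviour.
CAPTURE_CONTENT_ENV = "OTEL_INSTRUMENTATION_GENAI_CAPTURE_MESSAGE_CONTENT"


def is_content_enabled(environ: Optional[Mapping[str, str]] = None) -> bool:
    """Return True only when content capture was explicitly enabled."""
    env = os.environ if environ is None else environ
    return env.get(CAPTURE_CONTENT_ENV, "false").strip().lower() == "true"


def build_input_event(
    event_name: str, embedding_input: Any, capture_content: bool
) -> Dict[str, Any]:
    """Hypothetical helper: describe an input event as a plain dict."""
    event: Dict[str, Any] = {
        "event.name": event_name,
        "attributes": {"gen_ai.system": "openai"},
    }
    if capture_content:
        # The raw input is sensitive, so it is only attached on opt-in.
        event["body"] = {"content": embedding_input}
    return event


event = build_input_event(
    "gen_ai.embeddings.input", "text to embed", is_content_enabled({})
)
```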


...opentelemetry-instrumentation-openai-v2/src/opentelemetry/instrumentation/openai_v2/patch.py

+                          output_event_attributes = {
+                              GenAIAttributes.GEN_AI_SYSTEM: GenAIAttributes.GenAiSystemValues.OPENAI.value,
+                              EventAttributes.EVENT_NAME: "gen_ai.embeddings.output",


Contributor

lmolkova May 4, 2025


see my prev comment, I'd rather not define any new events for now


...opentelemetry-instrumentation-openai-v2/src/opentelemetry/instrumentation/openai_v2/patch.py (outdated, resolved)

...opentelemetry-instrumentation-openai-v2/src/opentelemetry/instrumentation/openai_v2/patch.py Outdated

                       GenAIAttributes.GEN_AI_SYSTEM: GenAIAttributes.GenAiSystemValues.OPENAI.value,
                       GenAIAttributes.GEN_AI_REQUEST_MODEL: span_attributes[
                           GenAIAttributes.GEN_AI_REQUEST_MODEL
                       ],
                   }
+                  if "gen_ai.embeddings.dimensions" in span_attributes:


Contributor

lmolkova May 4, 2025


we can't use span attributes to record metrics - spans are sampled and won't have any attributes when sampled out.


xrmx reacted with thumbs up emoji


Member Author

drewby May 6, 2025


The parameter is called span_attributes, but it is from the previous PR.

It's just a dictionary created during the request by chat_completions_create and now embeddings_create, so it is not subject to sampling. I renamed it here in _record_metrics to request_attributes for clarity. If needed, I can rename it in the create methods for better readability as well.
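
For illustration only (not code from this PR): a minimal sketch of the point made in this thread. Metric attributes are taken from a request-scoped dictionary, so a sampled-out span that keeps no attributes does not affect the recorded metric. `FakeSpan`, `FakeHistogram`, and `record_embeddings_call` are hypothetical stand-ins, not OpenTelemetry SDK classes; the dimensions key is the one shown in the diff above.

```python
from typing import Any, Dict, List, Tuple

# Keys copied from the request dictionary onto the metric data point.
METRIC_KEYS = (
    "gen_ai.system",
    "gen_ai.request.model",
    "gen_ai.embeddings.dimensions",
)


class FakeHistogram:
    """Stand-in for a histogram instrument; stores recorded points."""

    def __init__(self) -> None:
        self.points: List[Tuple[float, Dict[str, Any]]] = []

    def record(self, amount: float, attributes: Dict[str, Any]) -> None:
        self.points.append((amount, dict(attributes)))


class FakeSpan:
    """Stand-in for a span; a sampled-out span drops its attributes."""

    def __init__(self, recording: bool) -> None:
        self._recording = recording
        self.attributes: Dict[str, Any] = {}

    def set_attributes(self, attributes: Dict[str, Any]) -> None:
        if self._recording:
            self.attributes.update(attributes)


def record_embeddings_call(
    span: FakeSpan,
    histogram: FakeHistogram,
    request_attributes: Dict[str, Any],
    duration_s: float,
) -> None:
    # The span may silently discard these when it is sampled out.
    span.set_attributes(request_attributes)
    # Metrics read from the request dictionary, never from the span.
    metric_attributes = {
        key: request_attributes[key]
        for key in METRIC_KEYS
        if key in request_attributes
    }
    histogram.record(duration_s, metric_attributes)


request_attributes = {
    "gen_ai.system": "openai",
    "gen_ai.request.model": "text-embedding-3-small",
    "gen_ai.embeddings.dimensions": 256,
}
sampled_out_span = FakeSpan(recording=False)
histogram = FakeHistogram()
record_embeddings_call(sampled_out_span, histogram, request_attributes, 0.12)
```

In this sketch the sampled-out span ends up with no attributes, while the histogram point still carries the model and dimensions.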


...opentelemetry-instrumentation-openai-v2/src/opentelemetry/instrumentation/openai_v2/patch.py (outdated, resolved)

...opentelemetry-instrumentation-openai-v2/src/opentelemetry/instrumentation/openai_v2/patch.py (outdated, resolved)

xrmx reviewed


instrumentation-genai/opentelemetry-instrumentation-openai-v2/CHANGELOG.md (outdated, resolved)

instrumentation-genai/opentelemetry-instrumentation-openai-v2/examples/embeddings/README.rst (outdated, resolved)

...opentelemetry-instrumentation-openai-v2/src/opentelemetry/instrumentation/openai_v2/utils.py (outdated, resolved)

...opentelemetry-instrumentation-openai-v2/src/opentelemetry/instrumentation/openai_v2/patch.py (outdated, resolved)

drewby added 6 commits

May 6, 2025 06:22

- Fix changelog (f9c2a1e)
- Use gen_ai.embeddings.dimension.count (4d1497e)
- Remove total_tokens (ebb728a)
- Use end_on_exit (5b5743b)
- Don't import conditionally
- Fix heading (03e1bba)

drewby added 3 commits

May 6, 2025 07:41

- Use gen_ai.request.encoding_formats (94b4ad3)
- Use gen_ai.request.encoding_formats (1c12412)
- Rename span_attr to request_attr (168d40e)

lmolkova added this to GenAI Semantic Conventions and Instrumentation libraries

lmolkova moved this to In Progress in GenAI Semantic Conventions and Instrumentation libraries


Reviewers

- xrmx: left review comments
- lmolkova: left review comments
- alizenhom: awaiting requested review
- codefromthecrypt: awaiting requested review
- gyliu513: awaiting requested review
- karthikscale3: awaiting requested review
- lzchen: awaiting requested review
- nirga: awaiting requested review

At least 1 approving review is required to merge this pull request.

Assignees

codefromthecrypt

Labels

None yet

Projects

GenAI Semantic Conventions and Instrumentation libraries

Status: In Progress

Milestone

No milestone


9 participants
