feat: Add support for OpenTelemetry #551
krlmlr
left a comment
Thanks, lovely!
For what functions does it not make sense to implement telemetry? I guess dbQuote*(), what else?
Does this supersede https://github.com/r-dbi/dblog? Do you think a non-invasive approach like used there would be feasible here as well? What is the overhead if no listeners are active?
If we need to add a dependency here, a suggested package would be preferred.
hadley
left a comment
Looks like a great first step!
My preliminary thoughts are that we should instrument all 'full transactions', where we might be interested in the length of the spans. Otel expects all spans to be short-lived. Hence for example, we don't have a span that starts with
Gabor has spent a lot of time to make the interface as user-friendly as possible. I think the idea that you can get instrumentation for free with no code changes, and can leave it on in production is a powerful proposition. If not active, there is practically no overhead - the current main instrumentation function
Updated to suggests in d6213f6.
I've updated this PR to cover the high-level operations - let me know if any obvious ones are missing. Instrumenting the lower-level ones would result in much noisier output.
Live link here: https://logfire-eu.pydantic.dev/public-trace/73de9eac-7379-4581-b285-845a7a52c56b?spanId=eddfab36dba4de22
Re. documentation, let me know if you have a particular preference here, e.g. if you want to stick with a news item (knitr), or have a separate vignette (mirai).
?otelsdk::collecting for more details on configuring OpenTelemetry
krlmlr
left a comment
Thanks. My understanding is that this is opt-in, and that tracing for DBI can be disabled even if tracing for other sources is enabled. I wonder if we can emit a banner message when connecting that points to relevant documentation?
Yes, you're right - and detailed in the otelsdk instrumentation docs.
I'm thinking that in some cases it may be a system admin who has set up otel collection rather than the end user. So it may be surprising for the user to see a banner, especially as they wouldn't then know what to do with the information.
?otelsdk::collecting for more details on configuring OpenTelemetry
As this rolls out across more packages, we'll do more to promote it, so hopefully folks start to internalise that this sort of observability is available in all the packages they rely on the most.
I've now updated this PR with a common approach on caching the tracer, and a testing helper (following discussions with @schloerke who's been spearheading the otel integration in Shiny/promises).
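As an illustration of what caching the tracer can look like, here is a minimal sketch, not the code in this PR. The names sketch_tracer, sketch_is_tracing, sketch_cache_tracer() and sketch_span() are hypothetical, and the otel calls (otel::get_tracer(), the tracer's is_enabled() method, otel::start_local_active_span()) are used as I understand that package's API, so treat them as assumptions. The idea is that the tracer is looked up once, and every instrumented call afterwards only pays for one logical check when tracing is off.

```r
# Hypothetical package-level state; plain globals in this sketch.
sketch_tracer <- NULL
sketch_is_tracing <- FALSE

# Look up the tracer once (e.g. from .onLoad) and remember whether it is enabled.
sketch_cache_tracer <- function() {
  # otel is only a suggested package, so return quietly if it is missing.
  if (!requireNamespace("otel", quietly = TRUE)) {
    return(invisible(FALSE))
  }
  sketch_tracer <<- otel::get_tracer("sketch.tracer")
  sketch_is_tracing <<- isTRUE(sketch_tracer$is_enabled())
  invisible(sketch_is_tracing)
}

# Start a span tied to the caller's frame, or do nothing when tracing is off.
sketch_span <- function(name, attributes = NULL, scope = parent.frame()) {
  if (!sketch_is_tracing) {
    return(invisible(NULL))
  }
  otel::start_local_active_span(
    name,
    attributes = attributes,
    tracer = sketch_tracer,
    activation_scope = scope
  )
}

sketch_cache_tracer()
```

An instrumented function would call sketch_span("dbConnect") as its first statement; with no collector configured the call returns immediately.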
Pull request overview
This PR adds basic OpenTelemetry instrumentation to DBI, implementing tracing for database operations following the OpenTelemetry semantic conventions for database spans. The implementation provides observability into database operations by creating spans for connections, queries, and table operations.
Key changes:
- Core OpenTelemetry infrastructure with lazy initialization and tracer caching
- Instrumentation added to generic database operations (connect/disconnect, queries, table operations)
- Test coverage for OpenTelemetry tracing functionality
Reviewed changes
Copilot reviewed 18 out of 18 changed files in this pull request and generated 7 comments.
Show a summary per file
| File | Description |
|---|---|
| R/otel.R | New file implementing core OpenTelemetry helper functions for tracer management, span creation, and SQL query attribute extraction |
| R/zzz.R | Added tracer initialization call in .onLoad hook |
| R/DBI-package.R | Added .onLoad function to initialize OpenTelemetry tracer |
| R/dbConnect.R | Added OpenTelemetry span instrumentation for database connection |
| R/dbDisconnect.R | Added OpenTelemetry span instrumentation for database disconnection |
| R/dbGetQuery.R | Added OpenTelemetry span instrumentation for query execution |
| R/dbGetQueryArrow.R | Added OpenTelemetry span instrumentation for Arrow query execution |
| R/dbReadTable.R | Added OpenTelemetry span instrumentation for table reading |
| R/dbReadTableArrow.R | Added OpenTelemetry span instrumentation for Arrow table reading |
| R/13-dbWriteTable.R | Added OpenTelemetry span instrumentation for table writing |
| R/23-dbWriteTableArrow.R | Added OpenTelemetry span instrumentation for Arrow table writing |
| R/11-dbAppendTable.R | Added OpenTelemetry span instrumentation for table appending |
| R/21-dbAppendTableArrow.R | Added OpenTelemetry span instrumentation for Arrow table appending |
| R/12-dbCreateTable.R | Added OpenTelemetry span instrumentation for table creation |
| R/22-dbCreateTableArrow.R | Added OpenTelemetry span instrumentation for Arrow table creation |
| R/dbRemoveTable.R | Added OpenTelemetry span instrumentation for table removal |
| tests/testthat/test-otel.R | New test file validating OpenTelemetry span creation and attributes |
| DESCRIPTION | Added otel and otelsdk to Suggests dependencies |
R/dbGetQuery.R
Outdated
    otel_local_active_span(
      dynGet("attributes")$db.operation.name,
      conn,
      label = dynGet("attributes")$db.collection.name,
      attributes = make_query_attributes(statement)
Copilot
AI
Dec 4, 2025
The dynGet("attributes") calls on lines 65 and 67 attempt to retrieve the value of attributes before it's assigned on line 68, which will fail. The dynGet function searches for a variable in parent frames, but attributes is being assigned as a parameter in the current call, not in a parent frame.
This should be refactored to compute the attributes first, then use them:
    setGeneric("dbGetQuery", def = function(conn, statement, ...) {
      attributes <- make_query_attributes(statement)
      otel_local_active_span(
        attributes$db.operation.name,
        conn,
        label = attributes$db.collection.name,
        attributes = attributes
      )
      standardGeneric("dbGetQuery")
    })

Suggested change:

    - otel_local_active_span(
    -   dynGet("attributes")$db.operation.name,
    -   conn,
    -   label = dynGet("attributes")$db.collection.name,
    -   attributes = make_query_attributes(statement)
    + attributes <- make_query_attributes(statement)
    + otel_local_active_span(
    +   attributes$db.operation.name,
    +   conn,
    +   label = attributes$db.collection.name,
    +   attributes = attributes
@shikokuchuo: Can you confirm that dynGet() is doing what Copilot thinks it's doing? I understand that we want to run make_query_attributes() only once, and only if we actually record the span. How about:
    - otel_local_active_span(
    -   dynGet("attributes")$db.operation.name,
    -   conn,
    -   label = dynGet("attributes")$db.collection.name,
    -   attributes = make_query_attributes(statement)
    + otel_local_active_span(
    +   attributes$db.operation.name,
    +   conn,
    +   label = attributes$db.collection.name,
    +   attributes = { attributes <- make_query_attributes(statement) }
Scratch that. The code in make_query_attributes() is already brittle. Should we record source code locations instead?
Copilot doesn't understand dynGet() so it is somewhat wide of the mark. dynGet() is designed for use in arguments like this, but I avoid using it unless it's absolutely necessary. In this case, I think it adds more complexity than it abstracts away, so I've created a separate otel_query_local_active_span() in beddef5 to make the logic here much simpler.
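For readers less familiar with dynGet(), the following is a minimal sketch of the behaviour under discussion, not DBI's actual code. span_sketch() and make_attrs_sketch() are hypothetical stand-ins for otel_local_active_span() and make_query_attributes(). Because R evaluates arguments lazily, dynGet("attributes") only runs once the callee forces the argument, and at that point dynGet() walks the calling frames and finds the callee's own attributes argument.

```r
# Counts how often the (hypothetical) attribute helper runs.
n_calls <- 0L

# Stand-in for make_query_attributes(): returns operation and table name.
make_attrs_sketch <- function() {
  n_calls <<- n_calls + 1L
  list(op = "SELECT", tbl = "mtcars")
}

# Stand-in for otel_local_active_span(): simply forces its arguments.
span_sketch <- function(name, label, attributes) {
  list(name = name, label = label, attributes = attributes)
}

# `name` and `label` are promises. When span_sketch() forces them, dynGet()
# looks through the active frames, finds `attributes` in span_sketch()'s
# frame, and forces that promise, so the helper is evaluated only once.
res <- span_sketch(
  dynGet("attributes")$op,
  label = dynGet("attributes")$tbl,
  attributes = make_attrs_sketch()
)
```

The helper is never run if span_sketch() returns before touching its arguments, which is the property the original code was relying on.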
I've also taken the opportunity to make it somewhat more robust to possible variations in the query. Recording source code locations is an interesting alternative - possibly more useful for debugging, but I'm conscious that performing and parsing a sys.call() or equivalent for every call (not just in error cases) seems a bit wasteful. Also this may be less useful in code that's actively changing. On the other hand, the current approach which extracts db.operation.name (e.g., SELECT) and db.collection.name (table name) aligns with the otel semantic conventions for database spans.
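To make the attribute extraction concrete, here is a rough sketch of how db.operation.name and db.collection.name could be pulled out of a statement. sketch_query_attributes() is a hypothetical helper written for this illustration; it is not the make_query_attributes() in this PR, and its regular expressions only handle simple statements.

```r
# Hypothetical helper: derive span attributes from a SQL statement.
sketch_query_attributes <- function(statement) {
  sql <- trimws(statement)

  # Operation name: the leading keyword, upper-cased (NA if there is none).
  op <- regmatches(sql, regexpr("^[A-Za-z]+", sql))
  op <- if (length(op) == 1L) toupper(op) else NA_character_

  # Collection name: first identifier after FROM, INTO, UPDATE or TABLE.
  m <- regmatches(
    sql,
    regexec(
      "(?i)\\b(?:from|into|update|table)\\s+([A-Za-z_][A-Za-z0-9_.]*)",
      sql,
      perl = TRUE
    )
  )[[1]]
  tbl <- if (length(m) == 2L) m[[2]] else NA_character_

  list(db.operation.name = op, db.collection.name = tbl)
}

attrs <- sketch_query_attributes("SELECT * FROM mtcars WHERE cyl = 4")
```

Quoted identifiers, subqueries and CTEs would need more care than this, which is part of why the extraction is described as brittle earlier in the thread.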
krlmlr
left a comment
Thanks, great! I'll play with it locally to get a feeling as well.
    label = collection_name(name, conn),
    attributes = list(
      db.collection.name = collection_name(name, conn),
      db.operation.name = "INSERT INTO"
Should we distinguish between Arrow and data frame source?
Do you think distinguishing Arrow will be useful? I'm thinking the database operations would be the same, so I'd default to not doing anything, but we could add an attribute here to all the Arrow variants if you prefer.
    otel_is_tracing <- FALSE

    otel_cache_tracer <<- function() {
      requireNamespace("otel", quietly = TRUE) || return()
What happens if otel is installed during the session? Can we somehow support this use case?
Will otel print diagnostics on the console if it's active, by default?
If otel is installed mid-session, the user would need to restart R or call DBI:::otel_cache_tracer() directly for it to take effect. I don't think it's worth adding a public function here as we don't expect this to be a common use case. A typical workflow would be to have otel installed before starting a session. We've taken this approach for pretty much all the R packages that have been instrumented to date.
Regarding console output - by default, otel doesn't print diagnostics to the console. It only exports spans to a configured collector/backend. Users would still need to explicitly configure this via the OTEL_TRACES_EXPORTER env var, but stdout or stderr are options.
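As a rough illustration of the opt-in step, the sketch below sets the exporter from within R before running a few DBI calls. It assumes that otel and otelsdk are installed alongside a DBI build that includes this PR, that the variable is set before DBI is loaded (the tracer is cached at load time), and it uses RSQLite purely as an example backend; without those pieces the code still runs but emits no spans.

```r
# Opt in to tracing; "stderr" is one of the values mentioned above.
# Assumption: this must happen before DBI is loaded in the session.
Sys.setenv(OTEL_TRACES_EXPORTER = "stderr")
exporter <- Sys.getenv("OTEL_TRACES_EXPORTER")

# Run some high-level DBI operations if the packages are available.
if (requireNamespace("DBI", quietly = TRUE) &&
    requireNamespace("RSQLite", quietly = TRUE)) {
  con <- DBI::dbConnect(RSQLite::SQLite(), ":memory:")
  DBI::dbWriteTable(con, "mtcars", mtcars)
  res <- DBI::dbGetQuery(con, "SELECT COUNT(*) AS n FROM mtcars")
  DBI::dbDisconnect(con)
}
```

In a deployment the variable would more typically be set in the environment by whoever administers the collector, rather than from R code.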
Co-authored-by: Copilot <[email protected]>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Thanks @krlmlr for looking over this, appreciate it! Sorry it's been a while over the holidays and such, I hope I've adequately addressed your comments above.

This PR implements basic OpenTelemetry instrumentation for DBI.
It abides by the otel semantic conventions for database spans as far as possible (with considerations for limitations of the R API, performance, etc.).
The following is a screenshot of the spans created by running the examples for dbGetQuery(). This trace may also be examined interactively at this public link (30 day validity): https://logfire-eu.pydantic.dev/public-trace/a3da0166-cf62-43de-b194-864bf3c9e33d?spanId=77e385b051f86076
Implementation progress:
- dbConnect/dbDisconnect, dbCreateTable/dbRemoveTable, dbGetQuery

Todo:
- dbAppendTable, dbWriteTable/dbReadTable and all Arrow variants
- [ ] Add documentation (covered by news item + separate article for other packages)

I've assumed otel to be an 'imports' package for simplicity, but it shouldn't be a problem to move to 'suggests' if that's the preference.