
Conversation


@ndr-ds ndr-ds commented Dec 1, 2025

Motivation

Exporting traces directly to Tempo from every pod, with no sampling, produces a very high volume of data. That volume can backpressure the proxy and shards, flood them with errors, and hurt their performance.

A two-tier collector architecture (routers receiving traces from the pods, samplers performing tail-based sampling) is more efficient.
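As a rough sketch of what this changes on the pod side (assuming an opentelemetry_otlp builder-style API and a hypothetical in-cluster Service name for the router tier; neither is taken from this PR), each pod would point its OTLP exporter at the router collectors instead of at Tempo directly:

use opentelemetry_otlp::{SpanExporter, WithExportConfig};

// Sketch only: the Service name below is an assumption, not the PR's actual value.
fn build_pod_exporter() -> Result<SpanExporter, Box<dyn std::error::Error>> {
    // Pods export to the router tier; the routers fan traces out to the
    // sampler StatefulSet, which applies tail-based sampling before Tempo.
    let exporter = SpanExporter::builder()
        .with_tonic()
        .with_endpoint("http://otel-collector-router.observability.svc.cluster.local:4317")
        .build()?;
    Ok(exporter)
}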

Proposal

Add Kubernetes templates for the OTel collector infrastructure:

  • Router deployment template (receives traces from pods)
  • Sampler StatefulSet template (performs tail-based sampling)
  • Associated ConfigMaps and Services

Test Plan

  • Deploy the OTel collector infrastructure
  • Verify traces flow through the pipeline and that the previous export bottleneck is gone
  • Check that sampling rates are correct

Release Plan

  • Nothing to do / These changes follow the usual release cycle.

@ndr-ds ndr-ds force-pushed the 11-25-add_otel_collector_to_our_tracing_exports_to_tempo branch from 41fd9c0 to 184f4e0 on December 2, 2025 12:55
@ndr-ds ndr-ds force-pushed the 11-25-improve_dashboards branch 2 times, most recently from c42d64a to e439965 on December 2, 2025 13:26
@ndr-ds ndr-ds force-pushed the 11-25-add_otel_collector_to_our_tracing_exports_to_tempo branch 2 times, most recently from 85c2474 to c003b5b on December 3, 2025 21:30
@ndr-ds ndr-ds force-pushed the 11-25-improve_dashboards branch 2 times, most recently from ad6f9ec to 2b0654b on December 4, 2025 12:31
@ndr-ds ndr-ds force-pushed the 11-25-add_otel_collector_to_our_tracing_exports_to_tempo branch from c003b5b to 589cdd3 on December 4, 2025 12:31
@ndr-ds ndr-ds changed the base branch from 11-25-improve_dashboards to graphite-base/5049 on December 11, 2025 15:35
@ndr-ds ndr-ds force-pushed the 11-25-add_otel_collector_to_our_tracing_exports_to_tempo branch from 589cdd3 to 8911a78 on December 11, 2025 15:35
@ndr-ds ndr-ds changed the base branch from graphite-base/5049 to main on December 11, 2025 15:36
@ndr-ds ndr-ds force-pushed the 11-25-add_otel_collector_to_our_tracing_exports_to_tempo branch from 38270c4 to be3ef0b on January 7, 2026 22:26
@ndr-ds ndr-ds force-pushed the 12-03-limit_max_pending_message_bundles_in_benchmarks branch from 2681799 to 7ed6c81 on January 7, 2026 22:26
Comment on lines +138 to +147
// Configure batch processor for high-throughput scenarios
// Larger queue (16k instead of 2k default) to handle benchmark load
// Faster export (100ms instead of 5s default) to prevent queue buildup
let batch_config = opentelemetry_sdk::trace::BatchConfigBuilder::default()
.with_max_queue_size(16384) // 8x default, enough for 8 shards under load
.with_max_export_batch_size(2048) // Larger batches for efficiency
.with_scheduled_delay(std::time::Duration::from_millis(100)) // Fast export to prevent queue buildup
.build();

let batch_processor = BatchSpanProcessor::new(exporter, batch_config);
Contributor

Would it make sense to have different configs when running with the benchmark feature versus in production?
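One hedged sketch of how that could look, assuming a cargo feature named benchmark and keeping production on the SDK defaults (both assumptions, not something this PR implements):

use std::time::Duration;
use opentelemetry_sdk::trace::{BatchConfig, BatchConfigBuilder};

fn batch_config() -> BatchConfig {
    if cfg!(feature = "benchmark") {
        // High-throughput settings for benchmarks: large queue, fast export.
        BatchConfigBuilder::default()
            .with_max_queue_size(16384)
            .with_max_export_batch_size(2048)
            .with_scheduled_delay(Duration::from_millis(100))
            .build()
    } else {
        // Production: stay close to the SDK defaults (2048 queue, 5 s delay).
        BatchConfigBuilder::default().build()
    }
}

The production branch could also use its own explicit values; the point is only that the benchmark-sized queue would not apply outside benchmarks.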
