
Conversation

@lalitb
Member

@lalitb lalitb commented Jan 12, 2026

fixes: #1144

Summary

Adds HTTP/1.1 support for the OTLP receiver alongside the existing gRPC server. HTTP is disabled by default—existing gRPC-only deployments are unaffected.

Key Design Decisions

  1. Shared vs. Separate Concurrency
    • When HTTP is enabled, both protocols share a single Arc<Semaphore> per pipeline instance (i.e., per core in thread-per-core deployments, not cross-core), so the total number of in-flight requests respects downstream channel capacity regardless of protocol mix (sketched below).
    • When HTTP is disabled, gRPC retains the original GlobalConcurrencyLimitLayer with no new overhead.

Why: The downstream bounded channel is the true bottleneck. If gRPC and HTTP had separate limits, a burst on one protocol could still overwhelm the shared channel. Sharing ensures backpressure is applied uniformly. Preserving GlobalConcurrencyLimitLayer for gRPC-only avoids introducing Arc overhead for deployments that don't need HTTP.
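
A minimal sketch of this split, assuming hypothetical names (max_concurrent_requests, http_enabled) rather than the PR's exact wiring:

use std::sync::Arc;
use tokio::sync::Semaphore;
use tower::limit::GlobalConcurrencyLimitLayer;

fn build_admission_control(max_concurrent_requests: usize, http_enabled: bool) {
    if http_enabled {
        // One permit pool per pipeline instance; both servers draw from it,
        // so the combined in-flight count stays bounded regardless of protocol mix.
        let permits = Arc::new(Semaphore::new(max_concurrent_requests));
        let grpc_permits = Arc::clone(&permits); // handed to the tonic service wrapper
        let http_permits = Arc::clone(&permits); // handed to the HTTP server
        let _ = (grpc_permits, http_permits);
    } else {
        // gRPC-only: keep the original Tower layer, no extra Arc in the hot path.
        let _layer = GlobalConcurrencyLimitLayer::new(max_concurrent_requests);
    }
}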

  2. Reuse Existing Infrastructure
    • Uses effect_handler.tcp_listener() for socket creation (inherits SO_REUSEPORT, keepalive settings)
    • Shares AckRegistry for wait-for-result ACK/NACK flow
    • Shares OtlpReceiverMetrics for unified observability

Why: Consistency and reduced maintenance. Operators see one set of metrics regardless of protocol; ACK/NACK behavior is identical; socket tuning is centralized in the engine.

  3. Lazy Decode (Zero-Copy Path)

    • HTTP body is wrapped as OtlpProtoBytes without deserialization, matching gRPC's lazy-decode strategy
    • JSON content-type not implemented because it would require deserialization, breaking zero-copy
  4. Send Bounds Trade-off

    • Tonic requires Send futures. HTTP shares Arc<Mutex<...>>-wrapped state with gRPC for metrics (OtlpReceiverMetrics) and ACK slots (AckRegistry); this state must be Arc-wrapped because tonic's service handlers require Send + Sync, so the HTTP path uses tokio::spawn (not spawn_local) and Arc rather than Rc.
    • Trade-off accepted to avoid duplicating metrics/ACK infrastructure

Why: Duplicating metrics/ACK state for a !Send HTTP path would add complexity and divergence. Since tonic already forces Send on the gRPC side, sharing state via Arc is the pragmatic choice. The atomic overhead is acceptable given the I/O-bound workload.

  5. Semaphore-Based Admission Control
    • HTTP uses semaphore.acquire_owned() with a timeout
    • Permit timeout: uses the configured http.timeout if set, otherwise falls back to 5s
    • If permit isn't acquired within timeout, returns 503 Service Unavailable
    • This differs from gRPC's GlobalConcurrencyLimitLayer, which gates readiness at poll_ready and applies backpressure rather than queuing

Why: HTTP doesn't have Tower middleware, so we use a raw semaphore. The timeout allows brief queuing during bursts (fairer than immediate rejection) while still bounding wait time. Immediate rejection would cause more client retries and load amplification.
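
A sketch of the HTTP admission path, assuming a shared Arc<Semaphore> and a permit_timeout derived from http.timeout (5s fallback); not the PR's exact code:

use std::{sync::Arc, time::Duration};
use http::StatusCode;
use tokio::sync::{OwnedSemaphorePermit, Semaphore};

async fn admit(
    permits: Arc<Semaphore>,
    permit_timeout: Option<Duration>,
) -> Result<OwnedSemaphorePermit, StatusCode> {
    // http.timeout if configured, otherwise the 5s fallback.
    let wait = permit_timeout.unwrap_or(Duration::from_secs(5));
    match tokio::time::timeout(wait, permits.acquire_owned()).await {
        Ok(Ok(permit)) => Ok(permit),                             // admitted; permit released on drop
        Ok(Err(_closed)) => Err(StatusCode::SERVICE_UNAVAILABLE), // semaphore closed during shutdown
        Err(_elapsed) => Err(StatusCode::SERVICE_UNAVAILABLE),    // queued too long: 503
    }
}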

  6. Body Collection with Size Limits
    • Uses http_body_util::Limited to enforce max_request_body_size during body collection
    • Aborts early with 400 Bad Request if wire size exceeds limit
    • Dual-limit enforcement: limit checked again after decompression to prevent decompression bombs

Why: HTTP/1.1 requires buffering the full body before processing (unlike gRPC streaming). Without limits, a malicious client could exhaust memory. Dual enforcement (wire + decompressed) defends against both large payloads and zip bombs where a small compressed payload expands to gigabytes.
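
A sketch of the wire-size check using http_body_util::Limited (the decompressed-size re-check is sketched separately under Key Changes below); bounds and status codes are illustrative:

use http::StatusCode;
use http_body_util::{BodyExt, Limited};

async fn collect_body<B>(body: B, max_request_body_size: usize) -> Result<bytes::Bytes, StatusCode>
where
    B: http_body::Body,
    B::Error: Into<Box<dyn std::error::Error + Send + Sync>>,
{
    // Limited fails the collection as soon as the wire size crosses the cap,
    // so an oversized request is rejected before it is fully buffered.
    match Limited::new(body, max_request_body_size).collect().await {
        Ok(collected) => Ok(collected.to_bytes()),
        Err(_too_large_or_io) => Err(StatusCode::BAD_REQUEST),
    }
}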

  7. Protobuf Only
    • JSON content-type not implemented (can be added later if needed)
    • Keeps initial scope focused; protobuf is the primary OTLP format

Why: JSON would require deserialization in the receiver, breaking the zero-copy strategy. Protobuf is the canonical OTLP format and what most SDKs use. JSON support can be added as an opt-in path later if there's demand.

  8. TCP Socket Tuning for keep-alive
    • hyper enables HTTP/1.1 keep-alive by default (connections reused across requests)
    • TCP-level keep-alive is configurable via tcp_keepalive settings

Key Changes

  • New Module: crates/otap/src/otlp_http.rs — HTTP/1.1 server with POST /v1/{logs,metrics,traces}
  • Decompression: gzip, deflate, zstd via Content-Encoding
  • Config: Optional http: section; omit to keep gRPC-only behavior
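
For the dual-limit enforcement mentioned above, a sketch of Content-Encoding handling with a decompressed-size cap (gzip/deflate only, deflate treated as zlib-wrapped; zstd omitted for brevity; status-code choices are illustrative, not necessarily the PR's):

use std::io::Read;

use bytes::Bytes;
use flate2::read::{GzDecoder, ZlibDecoder};
use http::StatusCode;

// Re-check the size cap on the *decompressed* output so a small compressed
// payload cannot expand into gigabytes (decompression bomb).
fn decompress(encoding: &str, body: Bytes, max_decompressed: usize) -> Result<Bytes, StatusCode> {
    let mut out = Vec::new();
    let read = match encoding {
        "gzip" => GzDecoder::new(&body[..])
            .take(max_decompressed as u64 + 1)
            .read_to_end(&mut out),
        "deflate" => ZlibDecoder::new(&body[..])
            .take(max_decompressed as u64 + 1)
            .read_to_end(&mut out),
        "" | "identity" => return Ok(body),
        _ => return Err(StatusCode::UNSUPPORTED_MEDIA_TYPE),
    };
    match read {
        // Reading past the cap means the payload expanded too far: reject.
        Ok(_) if out.len() > max_decompressed => Err(StatusCode::BAD_REQUEST),
        Ok(_) => Ok(Bytes::from(out)),
        Err(_) => Err(StatusCode::BAD_REQUEST),
    }
}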

Documentation

  • Updated docs/otlp-receiver.md

Configuration Example:

Nodes:
  receiver:
    plugin_urn: "urn:otel:otlp:receiver"
    config:
      listening_addr: "0.0.0.0:4317"
      max_concurrent_requests: 0  # auto-tune
      
      http:  # Optional - omit to disable HTTP
        listening_addr: "0.0.0.0:4318"
        max_request_body_size: "4MiB"
        accept_compressed_requests: true
        timeout: "30s"

Limitations

  • JSON content-type not supported (protobuf only)
  • HTTP/2 not supported on HTTP server (gRPC uses HTTP/2 separately via tonic)
  • Response compression not implemented
  • HTTP shares Arc-backed metrics/ACK state with tonic; a split !Send HTTP path would require separate state and isn't planned here.

@github-actions github-actions bot added the rust Pull requests that update Rust code label Jan 12, 2026
@lalitb lalitb changed the title Grpc http receiver Add support for OTLP/HTTP Receiver Jan 12, 2026
@codecov

codecov bot commented Jan 12, 2026

Codecov Report

❌ Patch coverage is 83.30475% with 292 lines in your changes missing coverage. Please review.
✅ Project coverage is 84.38%. Comparing base (bd62852) to head (d9c1fdf).
⚠️ Report is 4 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1765      +/-   ##
==========================================
- Coverage   84.40%   84.38%   -0.02%     
==========================================
  Files         496      499       +3     
  Lines      145393   147090    +1697     
==========================================
+ Hits       122716   124124    +1408     
- Misses      22143    22432     +289     
  Partials      534      534              
Components Coverage Δ
otap-dataflow 85.63% <83.30%> (-0.05%) ⬇️
query_abstraction 80.61% <ø> (ø)
query_engine 90.52% <ø> (ø)
syslog_cef_receivers ∅ <ø> (∅)
otel-arrow-go 53.50% <ø> (ø)
quiver 90.66% <ø> (ø)

@lquerel
Contributor

lquerel commented Jan 13, 2026

@lalitb It's really great to have built-in HTTP support at the OTLP receiver level.

I'm waiting for a more detailed description of the PR and the approach taken before doing a deeper review. Thanks in advance.

@lalitb
Member Author

lalitb commented Jan 13, 2026

I'm waiting for a more detailed description of the PR and the approach taken before doing a deeper review. Thanks in advance.

Thanks @lquerel - This is still a draft while I finish benchmarks and a final review. I’ll add a detailed PR description and update the README with a summary of the changes shortly.

state: settings
    .wait_for_result
    .then(|| AckSlot::new(settings.max_concurrent_requests)),
state,
Member Author

The refactor just moved the AckSlot construction out of ServerCommon::new into the receiver so HTTP and gRPC can share the same slot pool when HTTP wait-for-result is enabled.

let std_stream: std::net::TcpStream = socket.into();
std_stream.set_nonblocking(true)?;
TcpStream::from_std(std_stream)
socket_options::apply_socket_options(
Member Author

@lalitb lalitb Jan 14, 2026

socket tuning is now centralized in socket_options.rs because both the proxy and the OTLP/HTTP server need the same keepalive/nodelay configuration (tokio -> std -> socket2 -> std -> tokio dance). This keeps the settings consistent across listeners and avoids duplicating the conversion code.
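
A sketch of that conversion round-trip (keepalive value is a placeholder; not the exact contents of socket_options.rs):

use tokio::net::TcpStream;

fn apply_socket_options(stream: TcpStream) -> std::io::Result<TcpStream> {
    let std_stream = stream.into_std()?;                 // tokio -> std
    let socket = socket2::Socket::from(std_stream);      // std -> socket2
    socket.set_nodelay(true)?;
    let keepalive = socket2::TcpKeepalive::new()
        .with_time(std::time::Duration::from_secs(30));
    socket.set_tcp_keepalive(&keepalive)?;
    let std_stream: std::net::TcpStream = socket.into(); // socket2 -> std
    std_stream.set_nonblocking(true)?;                   // required before handing back to tokio
    TcpStream::from_std(std_stream)                      // std -> tokio
}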

{
    if let Some(acceptor) = maybe_tls_acceptor.clone() {
        let shutdown = shutdown.clone();
        let _ = tracker.spawn(async move {
Member Author

This uses TaskTracker/tokio::spawn, which forces these per-connection handlers to be Send. If we want !Send HTTP handlers, serve (the HTTP server function) would need to run on a LocalSet and use spawn_local with a local tracker for draining. Note that serve is currently treated as a Send future and shares Arc-backed metrics/semaphore with the tonic gRPC path (Send + Sync), so a !Send HTTP path would also require decoupling that shared state. This is documented in the otlp_receiver.md too.
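
For reference, a rough sketch of what that !Send variant would look like (not implemented in this PR, and draining via a local tracker is elided):

use tokio::task::LocalSet;

// Hypothetical !Send variant: run the accept loop on a LocalSet so
// per-connection handlers can be spawned with spawn_local and hold
// non-Send (Rc/RefCell) state. Not done here because the handlers
// share Send + Sync state with the tonic path.
async fn serve_local() {
    let local = LocalSet::new();
    local
        .run_until(async {
            let handle = tokio::task::spawn_local(async {
                // handle_connection(...).await
            });
            let _ = handle.await;
        })
        .await;
}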

}

let shutdown = shutdown.clone();
let _ = tracker.spawn(async move {
Member Author

see comment.

@lalitb
Member Author

lalitb commented Jan 14, 2026

I'm waiting for a more detailed description of the PR and the approach taken before doing a deeper review. Thanks in advance.
Thanks @lquerel - This is still a draft while I finish benchmarks and a final review. I’ll add a detailed PR description and update the README with a summary of the changes shortly.

@lquerel - I've updated the PR description and docs/otlp_receiver.md with the approach and design details.

@lalitb lalitb marked this pull request as ready for review January 14, 2026 21:11
@lalitb lalitb requested a review from a team as a code owner January 14, 2026 21:11
Contributor

@lquerel lquerel left a comment

I have not finished the review yet, but there is one point that is bothering me and that I would like to discuss before continuing. Maybe I do not yet have the full picture, so please take the following with a grain of salt.

If I understand correctly, when HTTP is enabled, gRPC switches from GlobalConcurrencyLimitLayer to SharedConcurrencyLayer. The difference lies in where the semaphore is enforced:

  • GlobalConcurrencyLimitLayer gates in poll_ready. When at capacity, the service is not ready, so HTTP/2 stops accepting new streams and applies backpressure. This effectively bounds the number of in-flight requests to the configured limit.
  • SharedConcurrencyLayer forwards poll_ready directly to the inner service and only waits on the semaphore inside call. This means the server can accept an unbounded number of new streams and spawn futures that then sit parked waiting for a permit. Those parked futures still own decoded request payloads and metadata, so memory usage grows with the number of queued requests.

Is my analysis accurate? If so, I think this is a real problem, because it means memory is effectively unbounded, which is something we want to avoid as much as possible. There is no fixed cap on the number of pending requests once the semaphore is saturated. With a default 4 MiB max message size, even a few thousand queued requests could turn into multiple gigabytes of memory.

I think we need to find a way to reintroduce backpressure at the poll_ready level while still using your shared semaphore.

To avoid any OOM risk, a possible fix would be to reintroduce backpressure in poll_ready while still sharing the semaphore. For example, SharedConcurrencyLayer could acquire a permit in poll_ready like GlobalConcurrencyLimitLayer does and stash it for call, so that a not-ready state propagates back to tonic and limits stream acceptance.

@lalitb
Member Author

lalitb commented Jan 15, 2026

I have not finished the review yet, but there is one point that is bothering me and that I would like to discuss before continuing. Maybe I do not yet have the full picture, so please take the following with a grain of salt.

If I understand correctly, when HTTP is enabled, gRPC switches from GlobalConcurrencyLimitLayer to SharedConcurrencyLayer. The difference lies in where the semaphore is enforced:

  • GlobalConcurrencyLimitLayer gates in poll_ready. When at capacity, the service is not ready, so HTTP/2 stops accepting new streams and applies backpressure. This effectively bounds the number of in-flight requests to the configured limit.
  • SharedConcurrencyLayer forwards poll_ready directly to the inner service and only waits on the semaphore inside call. This means the server can accept an unbounded number of new streams and spawn futures that then sit parked waiting for a permit. Those parked futures still own decoded request payloads and metadata, so memory usage grows with the number of queued requests.

Is my analysis accurate? If so, I think this is a real problem, because it means memory is effectively unbounded, which is something we want to avoid as much as possible. There is no fixed cap on the number of pending requests once the semaphore is saturated. With a default 4 MiB max message size, even a few thousand queued requests could turn into multiple gigabytes of memory.

I think we need to find a way to reintroduce backpressure at the poll_ready level while still using your shared semaphore.

To avoid any OOM risk, a possible fix would be to reintroduce backpressure in poll_ready while still sharing the semaphore. For example, SharedConcurrencyLayer could acquire a permit in poll_ready like GlobalConcurrencyLimitLayer does and stash it for call, so that a not-ready state propagates back to tonic and limits stream acceptance.

@lquerel - Thanks for flagging this. You're right: with the shared layer acquiring the permit inside call, gRPC can accept an unbounded number of HTTP/2 streams that park waiting for permits, each holding its request payload in memory. I'll fix this as you suggested.

On the HTTP side, we acquire the permit before collecting the body, with a timeout (default 5s or http.timeout). Pending connections can still accumulate if the timeout is set high, but they hold only connection state and headers - the body isn't collected until after the permit is granted. The MB-scale memory concern is specific to gRPC. Happy to add stricter connection gating for HTTP if needed.
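
For illustration, the suggested shape (acquire the permit in poll_ready, stash it for call) could look roughly like this, following the pattern tower's ConcurrencyLimit uses with a polled semaphore stored on the service; names and details are illustrative, not the final code:

use std::sync::Arc;
use std::task::{ready, Context, Poll};
use tokio::sync::{OwnedSemaphorePermit, Semaphore};
use tokio_util::sync::PollSemaphore;
use tower::Service;

struct SharedConcurrency<S> {
    inner: S,
    semaphore: PollSemaphore, // wraps the Arc<Semaphore> shared with the HTTP server
    permit: Option<OwnedSemaphorePermit>,
}

impl<S> SharedConcurrency<S> {
    fn new(inner: S, shared: Arc<Semaphore>) -> Self {
        Self { inner, semaphore: PollSemaphore::new(shared), permit: None }
    }
}

impl<S, Req> Service<Req> for SharedConcurrency<S>
where
    S: Service<Req>,
{
    type Response = S::Response;
    type Error = S::Error;
    type Future = S::Future;

    fn poll_ready(&mut self, cx: &mut Context<'_>) -> Poll<Result<(), Self::Error>> {
        if self.permit.is_none() {
            // Backpressure point: stay not-ready until a shared permit frees up,
            // so tonic/h2 stops accepting new streams instead of queuing them.
            let permit = ready!(self.semaphore.poll_acquire(cx))
                .expect("semaphore is not closed while the receiver is running");
            self.permit = Some(permit);
        }
        self.inner.poll_ready(cx)
    }

    fn call(&mut self, req: Req) -> Self::Future {
        // A full implementation moves the permit into the response future so it
        // is released when the request completes; taken (and dropped) here for brevity.
        let _permit = self.permit.take().expect("poll_ready must succeed before call");
        self.inner.call(req)
    }
}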

Contributor

@lquerel lquerel left a comment

Thanks for fixing the unbounded memory issue.
