Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow HTTPX instrumentation to capture more request info, and improve API docs #655

Open
samuelcolvin opened this issue Dec 9, 2024 · 8 comments
Assignees
Labels

Comments

@samuelcolvin
Copy link
Member

Description

We should add the following kwargs to instrument_httpx:

  • capture_request_headers: bool | None = None
  • capture_request_body: bool | None = None
  • capture_response_headers: bool | None = None
  • capture_response_body: bool | None = None
  • capture_all: bool = False - just sets all the above to True

We can raise an error if you use these together with async_request_hook etc.

We can add a note saying that capturing the body in production might increase the amount of data collected significantly.

Also HTTPXInstrumentKwargs is not included in docs which makes it basically undocumented.

We should probably do the same for requests.

@samuelcolvin samuelcolvin added Feature Request P2 A bit less than P1. labels Dec 9, 2024
@alexmojaki alexmojaki self-assigned this Dec 9, 2024
@samuelcolvin samuelcolvin added P1 High priority and removed P2 A bit less than P1. labels Dec 13, 2024
@alexmojaki
Copy link
Contributor

@Kludex is currently working on headers in #671 which is close to finishing.

After that comes capturing JSON bodies which @Kludex will also work on, I've provided him with a PoC implementation in slack.

Then #668 can go through.

We discussed eventually capturing other types of bodies:

  1. Forms, which could be done by walking up the stack to retrieve the data from a frame instead of parsing bytes
  2. Arbitrary non-streamed bodies
  3. Arbitrary streamed bodies

I briefly toyed with the last case, leaving the code here in case we want it:

from dataclasses import dataclass
from typing import Iterable
import httpx
from httpx._content import IteratorByteStream
import logfire
from logfire.propagate import attach_context, get_context, ContextCarrier
from opentelemetry.trace import get_current_span, Span


def content():
    yield b"Hello, "
    yield b"world!"


logfire.configure()


@dataclass
class LoggingStream:
    stream: Iterable[bytes]
    ctx: ContextCarrier
    current_span: Span

    def __iter__(self):
        result = []
        for chunk in self.stream:
            result.append(chunk)
            yield chunk
        body = b"".join(result)
        if self.current_span.is_recording():
            # TODO this requires the bytes to be decodable as text
            self.current_span.set_attribute("http.body", body)
        else:
            with attach_context(self.ctx):
                logfire.info("Streamed body", body=body)


def request_hook(span, info):
    record_body(info)


def record_body(info):
    if isinstance(info.stream, IteratorByteStream):
        info.stream._stream = LoggingStream(
            info.stream._stream, get_context(), get_current_span()
        )


logfire.instrument_httpx(request_hook=request_hook)

r = httpx.post("https://httpbin.org/post", content=content())
print(r.text)

@Kludex
Copy link
Member

Kludex commented Dec 24, 2024

What else are we missing to close this? I can work on it today.

@alexmojaki
Copy link
Contributor

  1. capture_response_text_body
  2. Documentation of all the capture_* args
  3. capture_all
  4. Replacing Unpack[HTTPXInstrumentKwargs]
  5. Optionally refactoring response info similar to capture_request_text_body #722
  6. Optionally moving the enhanced request/response info classes into the public logfire.integrations.httpx module and documenting them with example hooks, the simplest being how to capture only request or response headers with a single method call in a hook.

@alexmojaki
Copy link
Contributor

@Kludex can you do 2-4?

@Kludex
Copy link
Member

Kludex commented Dec 30, 2024

You working on 3? If so, I can work on 2-4 after.

@alexmojaki
Copy link
Contributor

I am not.

@Kludex
Copy link
Member

Kludex commented Dec 30, 2024

I am not.

I don't understand if you are going to, if you think it shouldn't be added, or if you are implying that I can and should work on it.

@Kludex Kludex self-assigned this Dec 30, 2024
@Kludex
Copy link
Member

Kludex commented Dec 30, 2024

Waiting review:

I'll document tomorrow.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

3 participants