HTTPX instrumentation - document how to log request payload and response on response_hook
Is your feature request related to a problem? After reading the httpx instrumentation docs it is clear I need a response_hook and how to use it. But it has no example how to really use this with the span, request, response attributes to log the payload from request and response.
After digging into __init__() I see both ResponseInfo and RequestInfo have a stream property that could potentially be the solution, but each of them are a different type:
class RequestInfo(typing.NamedTuple):
....
typing.Optional[
typing.Union[httpx.SyncByteStream, httpx.AsyncByteStream]
]
class ResponseInfo(typing.NamedTuple):
...
stream: typing.Iterable[bytes]
In those cases it is not very clear an easy way to log the information from the payload.
Describe the solution you'd like A simple example of response_hook in the docs with logging of payload of request and response.
We'd like to claim this.
@staticdev it seems the issue here is that response stream returned by httpx is an iterator and if the response hook reads it, it will be exhausted before your application will get to read it (see comment in response_hook below). However as suggested here and here you could create a LogResponse class that prints each chunk as your application consumes it, so you get at least some logging.
import httpx
from opentelemetry.instrumentation.httpx import SyncOpenTelemetryTransport, AsyncOpenTelemetryTransport
class LogResponse(httpx.Response):
def iter_bytes(self, *args, **kwargs):
for chunk in super().iter_bytes(*args, **kwargs):
print(chunk)
yield chunk
class LogTransport(httpx.BaseTransport):
def __init__(self, transport: httpx.BaseTransport):
self.transport = transport
def handle_request(self, request: httpx.Request) -> httpx.Response:
response = self.transport.handle_request(request)
return LogResponse(
status_code=response.status_code,
headers=response.headers,
stream=response.stream,
extensions=response.extensions,
)
def response_hook(span, request, response):
status_code, headers, stream, extensions = response
span.set_attribute("http.status_code", status_code)
# This will add response to span but then won't return because
# iterator would have been exhausted.
# r = b"".join(stream)
# print(r)
# span.set_attribute("http.response.body", r)
transport = LogTransport(httpx.HTTPTransport())
telemetry_transport = SyncOpenTelemetryTransport(transport, response_hook=response_hook)
url = "https://google.com/"
with httpx.Client(transport=telemetry_transport) as client:
response = client.get(url)
Perhaps there is a way to also pass span to LogResponse and collect stream chunks there, but I don't know enough about OpenTelemetry internals at this point to say if it's feasible and/or advisable :).
The current API is not compatible with stream. I'm happy to work on the removal, I'm not sure how to do it in a backwards-compatible manner tho.
Instead of passing the RequestInfo, maybe we can pass the httpx.Request itself? 🤔
Passing httpx.Request and httpx.Response instead of ReqestInfo and ResponseInfo would be perfect as we can also access the higher level methods of these classes (such as .json()) and the response contents are cached, so we won't have the problem of trying to consume the stream twice. I'm working on a project where we need to include the request and response contents in the span, so I'd be happy to implement this.
In terms of the design I see two options:
- We pass
httpx.Requestandhttpx.Responseinstead ofRequestInfoandResponseInfoin the hooks. Since the httpx classes contain the same fields that can be found in the httpx classes that shouldn't cause any issues for the vast majority of users - We could create seperate hooks passing the httpx methods and allow the previous ones to work as before (possibly deprecating them) to keep it 100% backwards compatible
I prefer option (1) as it would keep the API easier to use, especially for new users but if strict backwards compatibility is a concern I'm happy to implement option (2). Or another suggestion of course
@Kludex @rok @emdneto @lzchen what do you think?
We are also seeing this issue for the same use case (logging response bodies). It would be great to get a fix in, even if it means a breaking change!
Same issue here. We'd love to contribute the fix!
In terms of the solution, are we concerning on option 1 here? I kinda prefer it because this is the cleanest way for long-term despite the fact of this being a breaking change.
We pass httpx.Request and httpx.Response instead of RequestInfo and ResponseInfo in the hooks. Since the httpx classes contain the same fields that can be found in the httpx classes that shouldn't cause any issues for the vast majority of users
@rok ^ could you please advise?
Maybe we could just add the response: httpx.Response as an additional attribute to ResponseInfo? Such that we don't break the existing usage. Refer to: https://github.com/open-telemetry/opentelemetry-python-contrib/blob/main/instrumentation/opentelemetry-instrumentation-httpx/src/opentelemetry/instrumentation/httpx/init.py#L288
@josephwangrb I've not had time to work on this and I'm not likely to in the near future. Please feel free to take over and unassign me from this issue.