[@hono/prometheus] Observe timings when the response finishes
Which middleware is the feature for?
@hono/prometheus
What is the feature you are proposing?
Hi. I added @hono/prometheus to my project (one using Mastra AI), but noticed that streaming responses are not being measured as I expected. For example, a 1 s stream shows up in http_request_duration_seconds_sum as having taken less than a millisecond. I have noticed this before in other middlewares, both in Hono and in other servers, which measure timings around await next() instead of listening for the finish event of http.ServerResponse. An isolated reproduction, with plain Hono on Node.js: https://github.com/moret/hono-prometheus-streaming . I have a couple of questions about this.
First: am I expecting something conceptually misguided? E.g., if the "request duration" is just the handling of the request, then this metric indeed should not be read as the full response time. I do have that expectation for non-streaming responses, but I can see that the name might imply the full time, meaning this would actually require another metric.
On that note, a second question / suggestion: do you see this as another metric that could be added as an option to the middleware (at least on Node.js deployments), or as an adjustment to the current one?
Finally, I initially asked this on the community Discord and was advised to bring it here. Is this the best place to discuss or suggest this as an option or adjustment to the middleware? If not, could you help me route this to a better place?
Thank you all in advance.
Cross referencing: https://discord.com/channels/1011308539819597844/1012485912409690122/1375155736815992873
Hi @moret
Regarding the third-party middleware hosted in this repository, the author of the middleware handles issues and pull requests. So @dios-david, do you have any idea?
Just my thought: I haven't done a deep dive, but could changing the registration order fix it? For example, adding the middleware above the routes.
Hi @moret, author of this middleware here. I think all your questions are valid, and I'm not fully sure which metric I would expect to see for streaming responses - the time to first byte or the time taken to finish the stream. Personally I'm leaning towards your idea of introducing a new metric which would measure the completion time of streaming responses. Thanks for the repro example, I will take a look into that!
Hello folks. While using the reproduction example mentioned above, I noticed that if an error is thrown inside streamText, it does not trigger the "low level" http.ServerResponse.on('error') callback. I updated the example to also capture that, but I'm not sure the approach I used would be helpful to a middleware, as it depends on the handler calling something back in case of an error. I tried using app.onError, but it's not called for streaming errors. Maybe global streaming error support, so that middlewares such as @hono/prometheus can observe these failures, would be the better way - I don't know. Anyway, I wanted to report back that the example has this considerable caveat. 😓