common HTTP Middleware Error Logging: Visibility on StatusBadGateway/ServiceUnavailable

The HTTP logging middleware classifies request results and logs them at either debug or warn level. Generally, errors are logged at warn and successes at debug.

https://github.com/weaveworks/common/blob/4b1847531bc94f54ce5cf210a771b2a86cd34118/middleware/logging.go#L56-L64

We need visibility into the error conditions below, which are currently logged at debug. Unfortunately, we can't turn on debug logging because of the volume.

statusCode == http.StatusBadGateway || statusCode == http.StatusServiceUnavailable

My guess is that these two statuses are logged at debug level because of the volume they generate when a backend is unavailable. We would like to log these failures at a higher level than debug, but we also recognize that the volume would be too great to log every one of them if a backend is down.

The change we'd like to make:

1. Move http.StatusBadGateway and http.StatusServiceUnavailable to be logged at Warn level with the other errors.
2. Use a configurable rate-limited logger for errors instead of logging 100% of them at Warn.

Thoughts?

If this (or something similar) is acceptable I'd be glad to PR it.

@bboreham

Jul 13 '20 15:07 joe-elliott

Background is here: https://github.com/cortexproject/cortex/issues/810, http://github.com/weaveworks/common/pull/84

I would be ok with sampling the messages (we have -event.sample-rate in cortex already); I guess a rate limit is also fine
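For comparison with a rate limit, sampling could look roughly like the sketch below. This is only an illustration of the general technique: the `sampler` type is hypothetical, and it is not a description of how -event.sample-rate is implemented in cortex. It passes one line out of every `rate` lines, so the logged volume scales with traffic instead of being capped.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// sampler is a hypothetical helper: it passes one line in every `rate` lines.
type sampler struct {
	rate  uint64
	count uint64
}

// sample reports whether the current line should be logged.
func (s *sampler) sample() bool {
	if s.rate <= 1 {
		return true // a rate of 0 or 1 means "log everything"
	}
	n := atomic.AddUint64(&s.count, 1)
	return (n-1)%s.rate == 0
}

func main() {
	s := &sampler{rate: 10}
	logged := 0
	for i := 0; i < 100; i++ {
		if s.sample() {
			logged++
		}
	}
	fmt.Println("logged", logged, "of 100 lines") // prints "logged 10 of 100 lines"
}
```

The trade-off is that a sampler still produces a lot of output during a large outage, while a rate limit puts a hard ceiling on it.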

You could also sample after the line hits the logfile?

Aug 11 '20 13:08 bboreham

Thanks for the background. As suspected, those two error codes were just overwhelming the logs, so they got removed. It sounds like you're OK with the general idea, so I will submit a PR and we can discuss details there.

"You could also sample after the line hits the logfile?"

I'm unsure what you mean by this. Do you mean logging everything but only pushing a certain subset of the logfile to the backend?

Aug 11 '20 20:08 joe-elliott