Rate limited errors
This PR attempts to address concerns in #192. It provides
- A rate limited logger
- A method of configuring the logger for high volume http errors through the server
- Fixed possible race condition
While working on this PR I noticed this:
func (l Log) logWithRequest(r *http.Request) logging.Interface {
traceID, ok := ExtractTraceID(r.Context())
if ok {
l.Log = l.Log.WithField("traceID", traceID)
}
return user.LogWith(r.Context(), l.Log)
}
Particularly the line:
l.Log = l.Log.WithField("traceID", traceID)
I'm unsure why the struct var is being overwritten in logWithRequest. This seems like a possible race condition to me? I assume there's one instance of the middleware that is shared by all requests in the pipeline?
why the struct var is being overwritten
it's a local variable, so not likely to cause a problem, but this was made less confusing in #193
Sorry I haven't been back here for a while. Code looks ok now; do you still want to merge it? Have you tried it out for real?
@joe-elliott ping on questions
Honestly, this has kind of fallen off of my radar. The initial need for this no longer something I am no longer involved in. Having said that I do think the rate limited logger has some value perhaps in weaveworks/common if we want to keep it.
The original reason this existed was to catch errors returned from Loki in the cortex-gw when the query frontend crashed. @cyriltovena @owen-d what do you think? is there still a need for this?
Right now I have a high rate of errors like this:
level=warn ts=2021-03-01T11:35:02.01609324Z caller=grpc_logging.go:38 method=/cortex.Ingester/Push duration=2.741131ms err="rpc error: code = Code(429) desc = user=redacted: per-metric series limit (local limit: 300000 global limit: 0 actual local limit: 300000) exceeded for series {redacted}" msg="gRPC\n"
I think this PR as it stands would not help me, because it is focused on one specific http error.