Background periodic publisher doesn't recover from a network exception
If the trace collector is temporarily down, the background thread that tries to reach it should survive flushSpans throwing ConnectionFailure. Instead, the thread dies with:
HttpExceptionRequest Request {
host = "localhost"
port = 9411
secure = False
requestHeaders = [("content-type","application/json")]
path = "/api/v2/spans"
queryString = ""
method = "POST"
proxy = Nothing
rawBody = False
redirectCount = 10
responseTimeout = ResponseTimeoutDefault
requestVersion = HTTP/1.1
}
(ConnectionFailure Network.Socket.connect: <socket: 54>: does not exist (Connection refused))
Agreed - the current behavior is not great. The background thread should fail the whole process on error (relevant read) or continue publishing.
In the meantime, here are a couple of suggestions to work around this:
- Call publish manually with adequate error handling.
- (Untested) Specify a custom request manager which retries on a subset of exceptions.
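For the first suggestion, here is a minimal sketch of what a manual publish loop with error handling could look like. It uses only base and takes the publish action as a plain IO () argument, so nothing here is the library's own API: trySync, publishSafely and publishLoop are hypothetical names made up for illustration, and catching every synchronous exception (rather than only network ones) is an assumption you may want to narrow.

```haskell
import Control.Concurrent (threadDelay)
import Control.Exception
  ( SomeAsyncException (..), SomeException, fromException, throwIO, try )
import Control.Monad (forever)
import System.IO (hPutStrLn, stderr)

-- | Hypothetical helper: run an action and return synchronous exceptions
-- as 'Left'. Asynchronous exceptions (e.g. ThreadKilled) are rethrown so
-- the thread running the loop can still be killed.
trySync :: IO a -> IO (Either SomeException a)
trySync action = do
  result <- try action
  case result of
    Left e
      | Just (SomeAsyncException _) <- fromException e -> throwIO e
    _ -> pure result

-- | Hypothetical helper: call the publish action once and log a failure
-- instead of propagating it. Returns True when the publish succeeded.
publishSafely :: IO () -> IO Bool
publishSafely publishAction = do
  result <- trySync publishAction
  case result of
    Right () -> pure True
    Left e -> do
      hPutStrLn stderr ("span publish failed, will try again next period: " ++ show e)
      pure False

-- | Hypothetical loop: publish every 'periodMicros' microseconds, forever,
-- surviving any synchronous exception thrown by the publish action.
publishLoop :: Int -> IO () -> IO a
publishLoop periodMicros publishAction = forever $ do
  _ <- publishSafely publishAction
  threadDelay periodMicros
```

Assuming the built-in periodic publisher is turned off, this could be started with something like forkIO (publishLoop 1000000 (publish zipkin)), where zipkin stands for whatever handle your setup passes to publish. Whether spans that failed to send remain buffered for the next attempt depends on the library, so treat that as something to verify.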
> The background thread should fail the whole process on error (relevant read) or continue publishing.
Right, for backend daemons the only viable option is to carry on, with or without delayed retries of the same payload, since failing the entire process is hardly desirable. How about performing another forkIO with a retry-only closure upon receiving a network exception? The number of retries could then be made configurable, in the same way as settingsPublishPeriod.
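To make the proposal concrete, here is a rough sketch of such a retry-only closure, again using only base. RetryPolicy, retryLimit, retryDelayMicros, sendWithRetry and forkRetry are all hypothetical names, not existing settings or functions of the library, and the send argument stands in for whatever action re-posts the same payload.

```haskell
import Control.Concurrent (ThreadId, forkIO, threadDelay)
import Control.Exception
  ( SomeAsyncException (..), SomeException, fromException, throwIO, try )

-- | Hypothetical retry settings, in the spirit of settingsPublishPeriod.
data RetryPolicy = RetryPolicy
  { retryLimit :: Int        -- ^ retries allowed after the first failed attempt
  , retryDelayMicros :: Int  -- ^ pause between attempts, in microseconds
  }

-- | Run 'send'; on a synchronous exception wait and try again, up to
-- 'retryLimit' extra attempts. Returns the last exception if all fail.
-- Asynchronous exceptions are rethrown so the thread stays killable.
sendWithRetry :: RetryPolicy -> IO () -> IO (Either SomeException ())
sendWithRetry policy send = go (retryLimit policy)
  where
    go remaining = do
      result <- try send
      case result of
        Right () -> pure (Right ())
        Left e
          | Just (SomeAsyncException _) <- fromException e -> throwIO e
          | remaining <= 0 -> pure (Left e)
          | otherwise -> do
              threadDelay (retryDelayMicros policy)
              go (remaining - 1)

-- | Fork a separate thread that only retries the given send action, so the
-- periodic publisher is not blocked while the collector is unreachable.
-- The final result is discarded here; a real version would likely log it.
forkRetry :: RetryPolicy -> IO () -> IO ThreadId
forkRetry policy send = forkIO $ do
  _ <- sendWithRetry policy send
  pure ()
```

One design question this leaves open is what to do once the retries are exhausted: dropping the payload keeps memory bounded, while re-queueing it risks unbounded growth if the collector stays down.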