Alternate Load-balancer IPs Not Used
Taking a look at ShufflingDNSResolver, it seems that although we return an array of InetAddress, only the first one will ever be used. I can't tell whether the multi-threaded nature of the connection pool means MDC.put("mantaLoadBalancerAddress", addresses[0].getHostAddress()); will ever change (or be set differently in different threads). Some printf testing made it seem like the resolve(final String host) method is only called once, when the client initially connects, but I can't be sure it wouldn't be called again if I waited long enough or provoked a network issue.
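For anyone following along, here is a minimal JDK-only sketch of the pattern in question: shuffle the resolved addresses, return the whole array, but record only index 0. `ShuffleSketch`, `ip`, `shuffle` and `recordedAddress` are hypothetical names for illustration, the 10.0.0.x addresses are made up, and this is not the actual ShufflingDNSResolver code.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical stand-in for the resolver discussed above; not the project's class.
final class ShuffleSketch {
    // Builds an address from literal octets, so no DNS lookup is performed.
    static InetAddress ip(int a, int b, int c, int d) {
        try {
            return InetAddress.getByAddress(new byte[] {(byte) a, (byte) b, (byte) c, (byte) d});
        } catch (UnknownHostException e) {
            throw new IllegalStateException(e); // only thrown for a wrong-length array
        }
    }

    // Returns a shuffled copy of the resolved addresses; the input is left untouched.
    static InetAddress[] shuffle(InetAddress[] resolved, Random rnd) {
        List<InetAddress> copy = new ArrayList<>(Arrays.asList(resolved));
        Collections.shuffle(copy, rnd);
        return copy.toArray(new InetAddress[0]);
    }

    // Mirrors the logging in question: only the first entry is ever recorded.
    static String recordedAddress(InetAddress[] shuffled) {
        return shuffled[0].getHostAddress();
    }

    public static void main(String[] args) {
        InetAddress[] lbs = {ip(10, 0, 0, 1), ip(10, 0, 0, 2), ip(10, 0, 0, 3)};
        InetAddress[] shuffled = shuffle(lbs, new Random());
        System.out.println("recorded: " + recordedAddress(shuffled));
        System.out.println("returned: " + Arrays.toString(shuffled));
    }
}
```

Whatever the caller does with the remaining entries is invisible here, which is why the recorded value alone can't tell us whether the alternates are ever tried.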
We can pass additional parameters to the PoolingHttpClientConnectionManager constructor to indicate that resolve should be called occasionally. I verified this worked as expected even though I couldn't find where the decision to re-resolve occurs in the HttpClient library.
I'll open a strawman PR with a ridiculously low TTL so we can discuss what value to use (or how to calculate it).
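To make the TTL discussion concrete, here is a toy model of why a connection time-to-live would lead to occasional re-resolution. It assumes that an expired connection is discarded and that resolve runs whenever a new connection is opened; that matches what I observed but is not something I traced through the HttpClient source. `TtlConnectionSketch` and all of its members are hypothetical, and nothing here is HttpClient API.

```java
import java.util.function.LongSupplier;
import java.util.function.Supplier;

// Toy model, not HttpClient code: a single "connection" that is thrown away
// once it is older than the TTL, forcing the next lease to resolve again.
final class TtlConnectionSketch {
    private final long ttlMillis;
    private final LongSupplier clock;        // injectable so the behavior can be tested
    private final Supplier<String> resolver; // stand-in for resolve(host)
    private String address;                  // null until the first connect
    private long createdAt;
    int resolveCalls;                        // how often the resolver was consulted

    TtlConnectionSketch(long ttlMillis, LongSupplier clock, Supplier<String> resolver) {
        this.ttlMillis = ttlMillis;
        this.clock = clock;
        this.resolver = resolver;
    }

    // Reuses the live connection, or reconnects (and re-resolves) once the TTL has passed.
    String lease() {
        long now = clock.getAsLong();
        if (address == null || now - createdAt >= ttlMillis) {
            address = resolver.get();
            createdAt = now;
            resolveCalls++;
        }
        return address;
    }

    public static void main(String[] args) {
        TtlConnectionSketch conn =
            new TtlConnectionSketch(1000, System::currentTimeMillis, () -> "10.0.0.1");
        conn.lease();
        conn.lease();
        System.out.println("resolve calls so far: " + conn.resolveCalls);
    }
}
```

In this model a very low TTL means nearly every request reconnects, while a very high one behaves like the single resolve seen in the printf testing, so the value to pick is a trade-off between connection reuse and how quickly new load-balancer IPs get picked up.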
Didn't this get fixed?
It was never actually confirmed that this was the source of the issue; reconfiguring the upstream networking equipment fixed the distribution of load. Additionally, it was determined that the clients far outnumbered the load-balancers, so #272 was abandoned.
You've described the IP usage pattern to me before (i.e. as errors occur the next one in the list will be utilized) so I think it might be better to file an issue with Apache HttpClient to make that behavior visible to consumers. Even with the changes in that PR there is still the possibility that some threads will record a null IP address. The steps I took to uncover this issue:
- create a client with the default connection pool configuration
- create more worker threads than available connections
- schedule all workers as quickly as possible
- observe the logs: a mix of IPs will show up in the request logs, but some of those IPs will be null
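The steps above can be imitated with the JDK alone. The sketch below rests on one assumption that is not confirmed in this thread: the address is recorded in a per-thread context (as MDC is) only on the thread that opens a new connection, so threads that lease an already-open connection never see it. `NullMdcSketch`, `ToyPool` and `MDC_ADDRESS` are hypothetical stand-ins, not HttpClient or SLF4J classes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.atomic.AtomicInteger;

final class NullMdcSketch {
    // Stand-in for MDC: one value per thread, null until that thread sets it.
    static final ThreadLocal<String> MDC_ADDRESS = new ThreadLocal<>();

    // Toy pool: the "resolve" step runs only on the thread that opens a connection.
    static final class ToyPool {
        private final int max;
        private final AtomicInteger attempts = new AtomicInteger();
        private final BlockingQueue<String> idle = new LinkedBlockingQueue<>();

        ToyPool(int max) { this.max = max; }

        String lease() throws InterruptedException {
            String conn = idle.poll();
            if (conn != null) return conn;      // reused: nothing recorded on this thread
            int n = attempts.incrementAndGet(); // may overshoot max; only n <= max connects
            if (n <= max) {
                String ip = "10.0.0." + n;      // pretend DNS answer
                MDC_ADDRESS.set(ip);            // recorded on the connecting thread only
                return ip;
            }
            return idle.take();                 // pool exhausted: wait for a release
        }

        void release(String conn) { idle.add(conn); }
    }

    // Starts one thread per worker and counts how many saw a null address.
    static int countNullObservations(int workers, int maxConnections) {
        ToyPool pool = new ToyPool(maxConnections);
        AtomicInteger nulls = new AtomicInteger();
        List<Thread> threads = new ArrayList<>();
        for (int i = 0; i < workers; i++) {
            Thread t = new Thread(() -> {
                try {
                    String conn = pool.lease();
                    if (MDC_ADDRESS.get() == null) nulls.incrementAndGet();
                    pool.release(conn);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            threads.add(t);
            t.start();
        }
        try {
            for (Thread t : threads) t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return nulls.get();
    }

    public static void main(String[] args) {
        System.out.println("threads with null address: " + countNullObservations(8, 2));
    }
}
```

With 8 workers and 2 connections, at most 2 threads can ever open a connection in this model, so at least 6 log a null address. If the real pool behaves the same way, a shorter TTL would change which IPs appear but would not remove the nulls.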
I'll update this issue with a link to the HTTPCLIENT ticket once I can boil the scenario down to something using just PoolingHttpClientConnectionManager.