cloud-sdk-java icon indicating copy to clipboard operation
cloud-sdk-java copied to clipboard

Cache evection of HTTP Client lead to not closed HttpClients

Open j-denner opened this issue 10 months ago • 4 comments

Describe the Bug

The current implementation of caching and evicting an HTTP Client lead to not closed HttpClients.

https://github.com/SAP/cloud-sdk-java/blob/4fb1f6a0a4a2517a10dc4b02c69a6f072d98db8a/cloudplatform/connectivity-apache-httpclient5/src/main/java/com/sap/cloud/sdk/cloudplatform/connectivity/DefaultApacheHttpClient5Cache.java#L42-L47

Technically the HttpClients are CloseableHttpClient and should / need(specifically for pooled connections) to be closed after usage. (See also this discussion)

In the current code-base, when a HttpClient with a PoolingHttpClientConnectionManager is created, it's not cleaned up properly on eviction.


The following workaround does not work in all cases:

        cache = Caffeine.newBuilder().expireAfterAccess(duration, unit).ticker(ticker).evictionListener((key, value, cause) -> {
            if (value instanceof CloseableHttpClient closeableHttpClient) {
                try {
                    closeableHttpClient.close();
                } catch (final Exception e) {
                    log.warn("Failed to close HttpClient. Ignoring the exception and continue.", e);
                }
            }
        }).build();

There are two basic scenarios:

  1. The client was created and is no longer in use and the eviction time triggers the cleanup. :white_check_mark:
  2. The client was created and in still in use (long running operation, async operation, ...) and the eviction time triggers the cleanup. Then the evictionListener would kill the connection underneath. :x:

Steps to Reproduce

Code review.

Expected Behavior

Proper closing of HttpClients.

Screenshots

No response

Used Versions

Current state in main.

Code Examples

// Your code here

Stack Trace

No response

Log File

Log file ...

Affected Development Phase

Development

Impact

No Impact

Timeline

No response

j-denner avatar Mar 19 '25 11:03 j-denner

  • I don't see why/how the cache should handle lifecycle of its stored object.
  • Cache eviction does not mean garbage collection or I/O handling. It only means the reference will not be accessible via JCache API anymore. The object may still live healthy and busy, referenced in other objects. Any behavior change implicitly/automatically via cache is risking unpredictable race conditions.
  • With the current information provided, the issue linked by your post is assumed to be caused by InputStream not being closed for past HTTP responses.

newtork avatar Mar 19 '25 11:03 newtork

Presumably, the linked ticket assumes the Apache HttpClient 4, which requires the manual connection closing / entity consumption implicitly. Your link above (DefaultApacheHttpClient5Cache) assumes Apache HttpClient 5, which should close all connections automatically immediately - if I understood the API contract correctly.

To reduce the uncertainty, I would ask you to please provide code samples, what exact API you are using and how.

newtork avatar Mar 19 '25 12:03 newtork

Actually the problem is present in 4 and 5. And the contract for CloseableHttpClient is: The one who creates it, is the responsible one for closing it. This does not happen due to handing over to a cache that does not take care.

Cache eviction does not mean garbage collection or I/O handling. It only means the reference will not be accessible via JCache API anymore. The object may still live healthy and busy, referenced in other objects. Any behavior change implicitly/automatically via cache is risking unpredictable race conditions.

I know that very well. I I do not mean GC. Actually it seems you mean finalization and I'm talking about closing resources and not reclaiming memory.

However that means the last one that gets the HttpClient would be responsible for closing. However, that defeats the API contract of your cache.

You can see that, by answering the question who calls close() on a CloseableHttpClient? You will not find anyone in the code and this should give you a hint that something is wrong.

j-denner avatar Mar 19 '25 12:03 j-denner

Hi Jürgen,

Thanks for the explanation and making us aware of the API and object-lifecycle implications. We definitely will consider alternative API options, once we go for another major version. Or if the plan to add another HttpClientFooBarAccessor, which technically would not be a breaking-change for the current major release versioning.


https://gist.github.com/newtork/643667efdd842dba93635e4eba821586

I was trying to explore worst-case scenarios in practice, but I'm struggling to identify threads being left open: The thread count remains static (JRE17).

However when looking at the stats for 1000+ sequential HTTP requests, I do notice memory accumulating and being continuously garbage collected. I did not explore a memory dump, but from what I can tell when looking at the numbers there is no memory leak. Specifically when running garbage collector manually, I see no increase over time after 10_000 requests.

newtork avatar Mar 20 '25 12:03 newtork