Client connection leak with chunked and compressed responses
Version: 3.0.7
TL;DR: the dropwizard-default HTTP client leaks connections if the response is both chunked and compressed.
Reproduction steps: run this app: https://github.com/eddsteel/zipped-chunked
Hi folks,
I'm seeing a connection leak in the HTTP client of a dropwizard app when calling another dropwizard app, if the response is both chunked and compressed. We noticed it in the kotlin-based dropwizard 3.0 client app when upgrading the kotlin-based dropwizard server from DW 2.1 to 3.0, but I think the issue could be older (I'll explain). It appears that 100% of the request connections are held open: in our dev environment metrics we see a 1:1 correlation between requests to a specific endpoint and leased connections, until the client app falls over.
I have reproduced the issue in a Java application with DW 3.0.7, using a single application as both client and server. I've tried to simplify/remove as much of our code as possible, leaving only dropwizard core.
More background: the connection leak was discovered in the client after upgrading the server. When I investigated locally I discovered that the "working" combination of 3.0 client and 2.1 server was not actually using gzip encoding, and the problem only appeared with the 3.0 server when the client was using compression. Configuring the client not to use gzip also fixed the connection leak. Since the 2.1/3.0 combo never used gzip, it's possible the problem exists with 2.1 too.
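For anyone wanting the same mitigation on a stock setup: with Dropwizard's standard Jersey client this is the gzipEnabled setting. A rough sketch of disabling it programmatically (this assumes the standard JerseyClientConfiguration/JerseyClientBuilder; the repro app instead drives this through its own compress config flag, and "environment" stands in for the application's Environment):
// sketch: build the Dropwizard Jersey client with gzip disabled
JerseyClientConfiguration clientConfig = new JerseyClientConfiguration();
clientConfig.setGzipEnabled(false);            // disable gzip support for responses
clientConfig.setGzipEnabledForRequests(false); // and for request bodies
Client client = new JerseyClientBuilder(environment)
        .using(clientConfig)
        .build("chunked-client");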
Reproduction
Anyway, to focus on the reproducible case: the latest v3 DW acts as both client and server. It should behave the same way as our systems as far as client creation and chunked data handling go, but with a lot of layers removed. The app has two endpoints to simulate server and client. POST / produces a ChunkedOutput response (just one chunk). GET /check calls POST / 1025 times and reads/parses the full body each time. If compression is enabled (via application config) the 1025th attempt fails due to an exhausted connection pool: none of the connections have been released.
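For orientation, the POST / resource looks roughly like this (a sketch, not the exact repro code; EndpointResponse and its constructor are stand-ins for the repro's own types):
// emit a single chunk, then close the ChunkedOutput directly
@POST
@Produces(MediaType.APPLICATION_JSON)
public ChunkedOutput<EndpointResponse> serve() {
    final ChunkedOutput<EndpointResponse> output = new ChunkedOutput<>(EndpointResponse.class);
    new Thread(() -> {
        try {
            output.write(new EndpointResponse("test", List.of(1, 2, 3))); // the one chunk
        } catch (IOException e) {
            // ignored in this sketch
        } finally {
            try {
                output.close();
            } catch (IOException ignored) {
            }
        }
    }).start();
    return output;
}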
[nix-shell:~/src/zipped-chunked]$ java -Ddw.client.compress=false -jar target/webapp-1.0-SNAPSHOT.jar server config.yml >/dev/null &
[1] 28790
[nix-shell:~/src/zipped-chunked]$ curl http://localhost:8080/check
done
[nix-shell:~/src/zipped-chunked]$ kill 28790
[nix-shell:~/src/zipped-chunked]$ java -Ddw.client.compress=true -jar target/webapp-1.0-SNAPSHOT.jar server config.yml >/dev/null &
[nix-shell:~/src/zipped-chunked]$ curl http://localhost:8080/check
{"code":500,"message":"Connections exhausted"}
[nix-shell:~/src/zipped-chunked]$ curl -v http://localhost:8080 -d {}
* Trying [::1]:8080...
* Connected to localhost (::1) port 8080
> POST / HTTP/1.1
> Host: localhost:8080
> User-Agent: curl/8.4.0
> Accept: */*
> Content-Length: 2
> Content-Type: application/x-www-form-urlencoded
>
< HTTP/1.1 200 OK
< Date: Thu, 06 Jun 2024 16:44:07 GMT
< Content-Type: application/json
< Vary: Accept-Encoding
< Transfer-Encoding: chunked
<
{"label":"test","integers":[1,2,3],"integerCount":3}
* Connection #0 to host localhost left intact
[nix-shell:~/src/zipped-chunked]$ curl --compressed -v http://localhost:8080 -d {}
* Trying [::1]:8080...
* Connected to localhost (::1) port 8080
> POST / HTTP/1.1
> Host: localhost:8080
> User-Agent: curl/8.4.0
> Accept: */*
> Accept-Encoding: deflate, gzip
> Content-Length: 2
> Content-Type: application/x-www-form-urlencoded
>
< HTTP/1.1 200 OK
< Date: Thu, 06 Jun 2024 16:44:52 GMT
< Content-Type: application/json
< Vary: Accept-Encoding
< Content-Encoding: gzip
< Transfer-Encoding: chunked
<
{"label":"test","integers":[1,2,3],"integerCount":3}
* Connection #0 to host localhost left intact
Investigation
As far as I can tell, both the ChunkedOutput and ChunkedInput objects are correctly closed in the application code: the ChunkedInput with try-with-resources, and the ChunkedOutput directly. In the compressed=false case, the EofSensorInputStream's EOF watcher's eofDetected method is called and the connection is released (here). It seems that when a GZIPInputStream is involved, the -1 is never received and the connection is never released.
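For reference, the client side reads the chunks roughly like this (again a sketch, not the exact repro code; client, endpoint and EndpointResponse are stand-ins). This is the pattern that still leaks when compression is enabled, even though the ChunkedInput is closed:
// GET /check consumes POST / like this; the ChunkedInput is closed via try-with-resources
GenericType<ChunkedInput<EndpointResponse>> chunkType =
        new GenericType<ChunkedInput<EndpointResponse>>() {};
try (ChunkedInput<EndpointResponse> input = client
        .target(endpoint)
        .request()
        .post(Entity.entity("{}", MediaType.APPLICATION_JSON_TYPE), chunkType)) {
    EndpointResponse chunk;
    while ((chunk = input.read()) != null) {
        // parse/validate the chunk
    }
}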
I took a stack trace at checkEOF for both cases and also logged each byte received at that breakpoint.
compressed
"pool-3-thread-7 - GET /check@4758" prio=5 tid=0x1b nid=NA runnable
java.lang.Thread.State: RUNNABLE
at org.apache.hc.core5.http.io.EofSensorInputStream.checkEOF(EofSensorInputStream.java:195)
at org.apache.hc.core5.http.io.EofSensorInputStream.read(EofSensorInputStream.java:119)
at java.util.zip.CheckedInputStream.read(CheckedInputStream.java:59)
at java.util.zip.GZIPInputStream.readUByte(GZIPInputStream.java:266)
at java.util.zip.GZIPInputStream.readUShort(GZIPInputStream.java:258)
at java.util.zip.GZIPInputStream.readHeader(GZIPInputStream.java:164)
at java.util.zip.GZIPInputStream.<init>(GZIPInputStream.java:79)
at java.util.zip.GZIPInputStream.<init>(GZIPInputStream.java:91)
at org.apache.hc.client5.http.entity.GZIPInputStreamFactory.create(GZIPInputStreamFactory.java:61)
at org.apache.hc.client5.http.entity.LazyDecompressingInputStream.initWrapper(LazyDecompressingInputStream.java:51)
at org.apache.hc.client5.http.entity.LazyDecompressingInputStream.read(LazyDecompressingInputStream.java:57)
at org.glassfish.jersey.message.internal.EntityInputStream.read(EntityInputStream.java:69)
at org.glassfish.jersey.message.internal.ReaderInterceptorExecutor$UnCloseableInputStream.read(ReaderInterceptorExecutor.java:263)
at org.glassfish.jersey.client.ChunkedInput$AbstractBoundaryParser.readChunk(ChunkedInput.java:112)
at org.glassfish.jersey.client.ChunkedInput.read(ChunkedInput.java:471)
at com.eddsteel.resources.ClientResource.callServer(ClientResource.java:30)
This gets the following bytes: 31, 16, 139, 8, 0, 6, 57, which seems to be only the GZIP header read by GZIPInputStream.readHeader in the constructor. However, the application code does eventually get the full body and parse the JSON.
not compressed
This code gets the full body, ending with a -1.
"pool-3-thread-7 - GET /check@4741" prio=5 tid=0x1b nid=NA runnable
java.lang.Thread.State: RUNNABLE
at org.apache.hc.core5.http.io.EofSensorInputStream.checkEOF(EofSensorInputStream.java:197)
at org.apache.hc.core5.http.io.EofSensorInputStream.read(EofSensorInputStream.java:119)
at org.glassfish.jersey.message.internal.EntityInputStream.read(EntityInputStream.java:69)
at org.glassfish.jersey.message.internal.ReaderInterceptorExecutor$UnCloseableInputStream.read(ReaderInterceptorExecutor.java:263)
at org.glassfish.jersey.client.ChunkedInput$AbstractBoundaryParser.readChunk(ChunkedInput.java:112)
at org.glassfish.jersey.client.ChunkedInput.read(ChunkedInput.java:471)
at com.eddsteel.resources.ClientResource.callServer(ClientResource.java:30)
Bytes received: (the full expected JSON + -1): 123, 34, 108, 97, 98, 101, 108, 34, 58, 34, 116, 101, 115, 116, 34, 44, 34, 105, 110, 116, 101, 103, 101, 114, 115, 34, 58, 91, 49, 44, 50, 44, 51, 93, 44, 34, 105, 110, 116, 101, 103, 101, 114, 67, 111, 117, 110, 116, 34, 58, 51, 125, 13, 10, -1
I think either the way the streams are layered or the behavior of LazyDecompressingInputStream is causing the EofSensorInputStream's EOF detection to be bypassed.
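To illustrate what I mean, here's a standalone sketch (not the repro code) showing that, at least for a small payload, the JDK's GZIPInputStream can reach its own end of stream without the stream it wraps ever returning -1, which is the signal EofSensorInputStream's checkEOF relies on:
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class GzipEofDemo {
    public static void main(String[] args) throws IOException {
        // gzip a small payload
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(bos)) {
            gz.write("{\"label\":\"test\"}".getBytes(StandardCharsets.UTF_8));
        }

        // stand-in for EofSensorInputStream: record whether any read ever returned -1
        final boolean[] sawMinusOne = {false};
        InputStream sensing = new FilterInputStream(new ByteArrayInputStream(bos.toByteArray())) {
            @Override public int read() throws IOException {
                int b = super.read();
                if (b == -1) sawMinusOne[0] = true;
                return b;
            }
            @Override public int read(byte[] b, int off, int len) throws IOException {
                int n = super.read(b, off, len);
                if (n == -1) sawMinusOne[0] = true;
                return n;
            }
        };

        // drain the decompressed stream to its end, like the application code does
        try (InputStream in = new GZIPInputStream(sensing)) {
            while (in.read() != -1) { /* drain */ }
        }

        // GZIPInputStream satisfied itself from its internal buffer, so the wrapped
        // stream never returned -1 and an EOF watcher layered on it would never fire
        System.out.println("wrapped stream returned -1: " + sawMinusOne[0]); // false here
    }
}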
Next steps
The code at fault seems to be within Apache HttpComponents, or possibly in how Jersey configures it, but I wanted to ask here first since you're the experts in combining these libraries:
- Is this familiar to anyone?
- Is there anything wrong in the example code wrt ChunkedInput/ChunkedOutput or compression configuration?
- Should I be directly closing/releasing connections somehow, and not relying on the EOF detection?
- Does it seem like a bug?
- Anything more I should do to isolate it?
- Should I report to Apache folks directly?
I've not had any luck with any of the other Jersey connector providers so far (I haven't even managed to make a successful request to this endpoint with them), but it might be an option.
Hi @eddsteel. This seems to be a Jersey issue. As you've correctly pointed out, the Apache resources will get cleaned up when the entity stream gets closed.
The specification of Response#readEntity(GenericType<T>) states:
Unless the supplied entity type is an input stream, this method automatically closes the an unconsumed original response entity data stream if open.
You're not directly using an InputStream, but the stream is exposed through the ChunkedInput and hence should be closed when closing the ChunkedInput. Read entities should be closed if they are not instances of Closeable or Source. However, ChunkedInput is Closeable, so the call to close() is correctly deferred to allow chunked reads. But the Jersey ReaderInterceptorExecutor disables closing the stream, so the resources of the Apache client cannot be released correctly.
I'd suggest opening a Jersey issue to expose the underlying entity input stream. This comment states why the close() method is forbidden at the moment, but when exposing an entity holding an input stream, the stream should be closeable IMHO.
But you also have an option to work around your current problem: when using the Response object and reading the entity in a second call, you can use Response#close() to free the resources. This would look somewhat like this in your client implementation:
Response resp = client
.target(endpoint)
.request()
.post(Entity.entity("{}", MediaType.APPLICATION_JSON_TYPE));
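// chunkedInputGenericType here would be something like new GenericType<ChunkedInput<EndpointResponse>>() {}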
ChunkedInput<EndpointResponse> entity = resp.readEntity(chunkedInputGenericType);
// ...
// after fully reading the entity
resp.close();
Thanks @zUniQueX for helping me understand the issue even though it's not really a dropwizard one. I see the gzip/EOF behaviour was just masking the real problem. I confirmed that closing the response directly, as well as the input, allows the resources to be released. We can apply the workaround to our production kotlin component too.
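For anyone hitting the same thing, the shape of the fix is roughly this (a sketch reusing the hypothetical names from the snippet above, not our exact code):
Response resp = client
    .target(endpoint)
    .request()
    .post(Entity.entity("{}", MediaType.APPLICATION_JSON_TYPE));
try (ChunkedInput<EndpointResponse> input =
        resp.readEntity(new GenericType<ChunkedInput<EndpointResponse>>() {})) {
    EndpointResponse chunk;
    while ((chunk = input.read()) != null) {
        // handle each chunk
    }
} finally {
    resp.close(); // closing only the ChunkedInput is not enough to release the connection
}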
I'll raise this with the Jersey project as I think it's at least counter-intuitive, and needing to close both the body input and the response seems like a strange special case.
Thanks again!
Jersey released the fix in versions 2.45, 3.0.15, 3.1.8 and 4.0.0-M2.