IOException with "Connection reset" message causes loss of requests

Open camiel-genesys opened this issue 1 year ago • 3 comments

Hi,

I have stumbled upon an error scenario that is difficult to reproduce. I am using OkHttp3 (5.0.0-alpha.11) and Retrofit (2.9.0) with Java 11.

I use Retrofit with OkHttp3 in a backend service that makes a large number of asynchronous POST requests to another backend server. It works fine almost all the time, but in a few cases the OkHttp3 client failed many of the active requests with an IOException whose message was "Connection reset" (sorry, I don't have the stack trace). This in itself is not an issue, since we retry these requests and the retries succeeded.

The real problem is that a few POST requests we created just a few hundred milliseconds after those exceptions NEVER got a response, not even a failure such as an exception or a timeout. It was as if these requests had never existed. The application relies on getting some outcome (success or failure) in order to move on, so getting nothing at all is very bad.
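
While the root cause is unknown, one defensive option is a caller-side deadline, so the application always observes some outcome even if the HTTP client never invokes its callback. The following is a minimal sketch, not OkHttp or Retrofit API: it assumes each asynchronous call is bridged to a `CompletableFuture` that the callback's `onResponse` and `onFailure` methods complete, and the names `CallWatchdog` and `withDeadline` are hypothetical. It only uses JDK classes available since Java 9.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

final class CallWatchdog {

    /**
     * Returns a future that mirrors {@code source} but fails with a
     * TimeoutException if {@code source} has not completed in time.
     * The original future is left untouched.
     */
    static <T> CompletableFuture<T> withDeadline(CompletableFuture<T> source,
                                                 long timeout, TimeUnit unit) {
        // copy() creates a dependent future; orTimeout() fails only that copy.
        return source.copy().orTimeout(timeout, unit);
    }

    public static void main(String[] args) {
        // Stands in for a request whose callback is never invoked.
        CompletableFuture<String> lostCall = new CompletableFuture<>();
        CompletableFuture<String> guarded =
                withDeadline(lostCall, 200, TimeUnit.MILLISECONDS);
        try {
            guarded.join();
        } catch (CompletionException e) {
            boolean timedOut = e.getCause() instanceof TimeoutException;
            System.out.println("Watchdog fired, timed out: " + timedOut);
        }
    }
}
```

The deadline would normally be set somewhat above the configured `callTimeout`, so that it only fires when the client's own timeout failed to report anything. This does not fix a lost callback; it only makes it visible to the application so the request can be retried and logged.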

I suspect this could be a sign of some kind of race condition somewhere. Unfortunately, I can't reproduce it locally...

I tried to understand what the "Connection reset" error means, but I couldn't find any documentation about it; even a search of the Square libraries on GitHub turned up nothing.
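
For context, the "Connection reset" text normally comes from the JDK socket layer rather than from OkHttp: it is the message of a `java.net.SocketException` (a subclass of `IOException`) raised when the peer, or something in between such as a load balancer, aborts the TCP connection with an RST. That would explain why a search of the Square repositories finds nothing. The sketch below reproduces the condition locally with plain JDK sockets and no OkHttp involved; the class name `ConnectionResetDemo` and method `provokeReset` are made up for illustration, and the exact message wording may differ between platforms.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

final class ConnectionResetDemo {

    /**
     * Connects to a local server that aborts the connection and returns the
     * exception seen by the client, or null if no exception was raised.
     */
    static IOException provokeReset() {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        try (ServerSocket server = new ServerSocket(0, 1, loopback);
             Socket client = new Socket(loopback, server.getLocalPort())) {
            Socket accepted = server.accept();
            // SO_LINGER with a zero timeout makes close() abort the
            // connection (TCP RST) instead of closing it gracefully.
            accepted.setSoLinger(true, 0);
            accepted.close();
            try {
                client.getInputStream().read(); // blocks until the RST arrives
                return null;
            } catch (IOException e) {
                return e;
            }
        } catch (IOException setupFailure) {
            throw new UncheckedIOException(setupFailure);
        }
    }

    public static void main(String[] args) {
        System.out.println("Client observed: " + provokeReset());
    }
}
```

In the scenario described here, a burst of such exceptions across many active requests would be consistent with the remote server or an intermediary dropping many pooled connections at once, although nothing in this report confirms that.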

OkHttp3 default client configuration:

        ConnectionSpec connectionSpec = new ConnectionSpec.Builder(ConnectionSpec.MODERN_TLS)
                .tlsVersions(TlsVersion.TLS_1_2, TlsVersion.TLS_1_3)
                .build();

        client = new OkHttpClient.Builder()
                .connectionSpecs(Arrays.asList(connectionSpec))
                .connectionPool(new ConnectionPool(2_000, 300, TimeUnit.SECONDS))
                .addInterceptor(getUserAgentHeaderInterceptor())
                .build();

        client.dispatcher().setMaxRequests(10_000);
        client.dispatcher().setMaxRequestsPerHost(10_000);

New client based on default client:

    protected OkHttpClient createHttpClient(OkHttpClient defaultClient, BaseClientConfig config) {
        OkHttpClient.Builder builder = defaultClient.newBuilder();

        builder.callTimeout(Duration.ofMillis(30_000));
        builder.connectTimeout(Duration.ofMillis(5_000));
        builder.readTimeout(Duration.ofMillis(30_000));
        builder.writeTimeout(Duration.ofMillis(10_000));

        return builder.build();
    }

Thanks, Camiel

camiel-genesys • Feb 07 '24 17:02

There are a bunch of fixes in 5.0.0-alpha.12, see https://github.com/square/okhttp/blob/master/CHANGELOG.md#version-500-alpha12

While I don't expect it to fix this, it would be good to rule it out first.

yschimke • Feb 10 '24 15:02

Yep, I upgraded our app recently with the latest version... Let's see if that helps...

Thanks, Camiel

camiel-genesys • Feb 12 '24 12:02

Hi, did you try it? I find that it also has the connection reset problem.

feng612266 • Mar 25 '24 11:03