Fail to retrieve token for high latency connection
When I connect over a high-latency satellite connection, the AWS SDK cannot retrieve a token. The problem started with 1.11.678. I have not found a configuration option to increase the timeout for the underlying operation. Can one be added?
In my case, I have a simple Spring Boot application, using Spring Cloud, with AWS SQS. By default, that pulls in version 1.11.415. We had trouble with connections not being closed properly and needed to upgrade the AWS SDK to prevent an open-file leak. The upgrade fixed the leak, but it introduced the token retrieval issue.
Stack trace
2020-06-17 09:08:22.518 level=WARN thread="pool-1-thread-16" c.a.i.InstanceMetadataServiceResourceFetcher - Fail to retrieve token
com.amazonaws.SdkClientException: Failed to connect to service endpoint:
at com.amazonaws.internal.EC2ResourceFetcher.doReadResource(EC2ResourceFetcher.java:100)
at com.amazonaws.internal.InstanceMetadataServiceResourceFetcher.getToken(InstanceMetadataServiceResourceFetcher.java:91)
at com.amazonaws.internal.InstanceMetadataServiceResourceFetcher.readResource(InstanceMetadataServiceResourceFetcher.java:69)
at com.amazonaws.internal.EC2ResourceFetcher.readResource(EC2ResourceFetcher.java:66)
at com.amazonaws.auth.InstanceMetadataServiceCredentialsFetcher.getCredentialsEndpoint(InstanceMetadataServiceCredentialsFetcher.java:58)
at com.amazonaws.auth.InstanceMetadataServiceCredentialsFetcher.getCredentialsResponse(InstanceMetadataServiceCredentialsFetcher.java:46)
at com.amazonaws.auth.BaseCredentialsFetcher.fetchCredentials(BaseCredentialsFetcher.java:112)
at com.amazonaws.auth.BaseCredentialsFetcher.getCredentials(BaseCredentialsFetcher.java:68)
at com.amazonaws.auth.InstanceProfileCredentialsProvider.getCredentials(InstanceProfileCredentialsProvider.java:166)
at com.amazonaws.auth.EC2ContainerCredentialsProviderWrapper.getCredentials(EC2ContainerCredentialsProviderWrapper.java:75)
at com.amazonaws.auth.AWSCredentialsProviderChain.getCredentials(AWSCredentialsProviderChain.java:117)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.getCredentialsFromContext(AmazonHttpClient.java:1225)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeOneRequest(AmazonHttpClient.java:1246)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1113)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:770)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:744)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:726)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500(AmazonHttpClient.java:686)
at com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:668)
at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:532)
at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:512)
at com.amazonaws.services.sqs.AmazonSQSClient.doInvoke(AmazonSQSClient.java:2207)
at com.amazonaws.services.sqs.AmazonSQSClient.invoke(AmazonSQSClient.java:2174)
at com.amazonaws.services.sqs.AmazonSQSClient.invoke(AmazonSQSClient.java:2163)
at com.amazonaws.services.sqs.AmazonSQSClient.executeReceiveMessage(AmazonSQSClient.java:1607)
at com.amazonaws.services.sqs.AmazonSQSAsyncClient$14.call(AmazonSQSAsyncClient.java:1055)
at com.amazonaws.services.sqs.AmazonSQSAsyncClient$14.call(AmazonSQSAsyncClient.java:1049)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.net.SocketException: Network is unreachable: connect
at java.net.DualStackPlainSocketImpl.waitForConnect(Native Method)
at java.net.DualStackPlainSocketImpl.socketConnect(DualStackPlainSocketImpl.java:85)
at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:172)
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
at java.net.Socket.connect(Socket.java:589)
at sun.net.NetworkClient.doConnect(NetworkClient.java:175)
at sun.net.www.http.HttpClient.openServer(HttpClient.java:463)
at sun.net.www.http.HttpClient.openServer(HttpClient.java:558)
at sun.net.www.http.HttpClient.
Environment
- AWS Java SDK version used: 1.11.791
- JDK version used: 1.8
- Operating System and version: Windows 10.0.18363
Hi @craigsmithmsp, the problem started in 1.11.678 because that is when the new Instance Metadata Service v2 was released. We have seen various reports of increased latency on the service side (like https://github.com/aws/aws-sdk-java/issues/2276 and https://github.com/aws/aws-sdk-java-v2/issues/1667).
Unfortunately, it is not possible to change the underlying connectionTimeout. I can mark this as a feature request if you'd like. You can also try adding custom retry logic, since the SDK won't retry IMDS credentials fetching.
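As a rough illustration of the custom retry logic suggested above, here is a minimal sketch of a generic retry helper with exponential backoff. It uses only the JDK, and the class and method names (RetrySketch, withRetry) are hypothetical, not part of the AWS SDK. It relies on the fact that SdkClientException is a RuntimeException.

```java
import java.util.function.Supplier;

// Hypothetical helper, not part of the AWS SDK: retries a call that may fail transiently.
final class RetrySketch {
    private RetrySketch() {}

    /**
     * Invokes the supplier up to maxAttempts times. After each failed attempt
     * (except the last) it sleeps, doubling the delay every time.
     * The last failure is rethrown if no attempt succeeds.
     */
    static <T> T withRetry(Supplier<T> call, int maxAttempts, long initialDelayMillis) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        long delay = initialDelayMillis;
        RuntimeException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return call.get();
            } catch (RuntimeException e) {
                last = e;
                if (attempt == maxAttempts) {
                    break; // no more attempts, fall through and rethrow
                }
                try {
                    Thread.sleep(delay);
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt(); // preserve interrupt status
                    throw e;
                }
                delay *= 2; // exponential backoff
            }
        }
        throw last;
    }
}
```

With the v1 SDK on the classpath, a credentials provider could presumably be wrapped as `RetrySketch.withRetry(provider::getCredentials, 5, 500)`. Whether retrying helps depends on the failure: it can smooth over occasional slow responses, but it will not help if every attempt exceeds the hard-coded timeout.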
Thank you, @debora-ito. Please mark it as a feature request. Spring Cloud does retry repeatedly, but without success. I have noticed on our EC2 instances that we sometimes get this error on startup, but there the retry resolves it quite reliably.
Hi all, I am seeing the same log message after updating aws-sdk to 1.11.807. Just for my understanding: it's not a real problem, right? I have a local Spring Boot service running that fetches data from S3, and it works even though I see this log message. Would it be OK to set the log level to ERROR?
It appears that ConnectionUtils has hard-coded connect & read timeouts set to 1s: https://github.com/aws/aws-sdk-java/blob/master/aws-java-sdk-core/src/main/java/com/amazonaws/internal/ConnectionUtils.java#L41
The Python SDK appears to obey an environment variable AWS_METADATA_SERVICE_TIMEOUT https://boto3.amazonaws.com/v1/documentation/api/1.9.42/guide/configuration.html#environment-variable-configuration but the Java SDK doesn't appear to have anything like that.
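To make the idea concrete, here is a minimal sketch of what an environment-variable-driven timeout could look like with plain HttpURLConnection. This is not the SDK's ConnectionUtils code. The class and method names are hypothetical, and interpreting the value as seconds with a 1 s fallback is an assumption based on the boto3 documentation and the hard-coded value mentioned above.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;

// Hypothetical sketch, not the SDK's implementation.
final class MetadataTimeoutSketch {
    static final String TIMEOUT_ENV_VAR = "AWS_METADATA_SERVICE_TIMEOUT";
    static final int DEFAULT_TIMEOUT_MILLIS = 1000; // mirrors the hard-coded 1 s

    private MetadataTimeoutSketch() {}

    /**
     * Converts a timeout given in seconds (assumption: same unit as boto3)
     * to milliseconds. Missing, non-numeric, or non-positive values fall
     * back to the default.
     */
    static int parseTimeoutMillis(String rawSeconds) {
        if (rawSeconds == null || rawSeconds.trim().isEmpty()) {
            return DEFAULT_TIMEOUT_MILLIS;
        }
        try {
            double seconds = Double.parseDouble(rawSeconds.trim());
            if (!(seconds > 0)) { // also rejects NaN
                return DEFAULT_TIMEOUT_MILLIS;
            }
            return (int) Math.min(Integer.MAX_VALUE, Math.round(seconds * 1000));
        } catch (NumberFormatException e) {
            return DEFAULT_TIMEOUT_MILLIS;
        }
    }

    /** Opens (but does not yet connect) a connection with both timeouts applied. */
    static HttpURLConnection open(URL url) throws IOException {
        int timeoutMillis = parseTimeoutMillis(System.getenv(TIMEOUT_ENV_VAR));
        HttpURLConnection connection = (HttpURLConnection) url.openConnection();
        connection.setConnectTimeout(timeoutMillis);
        connection.setReadTimeout(timeoutMillis);
        return connection;
    }
}
```

Falling back to the default on bad input keeps a typo in the environment from disabling the timeout entirely, since a timeout of 0 means "wait forever" for HttpURLConnection.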
Hi! I talked about this issue and described our custom solution in this article.
Any update on this issue? We are facing the same problem.
Can this issue be closed? From looking at the latest code, it looks like the Java SDK has read AWS_METADATA_SERVICE_TIMEOUT since 1.12.40, so the timeout is now configurable:
https://github.com/aws/aws-sdk-java/blob/8045d3dda6a4390516012fbc05ece5de13eba862/aws-java-sdk-core/src/main/java/com/amazonaws/internal/ConnectionUtils.java#L43-L65