aws-sdk-java-v2

Auto priming support

Open samdengler opened this issue 2 years ago • 9 comments

Describe the feature

At CapitalOne, we've observed notable performance benefits when priming AWS SDK clients in conjunction with Lambda SnapStart snapshotting. However, the practice of artificially constructing AWS SDK clients and invoking classes during the Lambda initialization phase is nonintuitive for engineers to discover and implement. If there were an option to auto-prime AWS SDK clients, we could more easily ensure that engineers follow a best practice by default and get better performance from the Lambda Java runtime when using the AWS SDK for Java 2.0.

Use Case

I'm frustrated when I need to include artificial code during Lambda function initialization, like a DynamoDbAsyncClient.describeEndpoints call, that constructs and invokes AWS SDK client classes so that they are included in the Firecracker snapshot, for a performance benefit when the Lambda handler is invoked.

Proposed Solution

A JAVA_TOOLS configuration for AWS SDK for Java 2.0 clients would be an easy mechanism for our engineers to manage auto-priming across compute using the SDK, although our focus is on Lambda SnapStart.

Other Information

No response

Acknowledgements

  • [ ] I may be able to implement this feature request
  • [ ] This feature might incur a breaking change

AWS Java SDK version used

2.19.26

JDK version used

openjdk 11.0.17 2022-10-18 LTS

Operating System and version

Java 11 Lambda runtime

samdengler avatar Feb 28 '23 15:02 samdengler

Hi @samdengler, thanks for reaching out. I did some experiments with exposing a warmUp API (naming subject to change, of course) on the SDK client that preloads all SDK classes, plus some Jackson classes if it is a JSON-based service. Customers can also provide an optional list of prime functions to invoke in case they want to warm up the connection pool. Would this work for your use case? Feedback is welcome!

The code would look like this:

DynamoDbClient client = DynamoDbClient.create();
WarmUpConfiguration configuration = WarmUpConfiguration.builder()
                                                       .initializeClasses(true)
                                                       .preloadClasses(true)
                                                       .primeFunctions(client::listTables)
                                                       .build();
client.warmUp(configuration);
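
For readers who want to see the two halves of this idea in isolation, here is a minimal stdlib-only sketch. It is not the SDK's implementation: WarmUpSketch, preload, and prime are hypothetical names invented for illustration, and the class names passed in are placeholders. It shows class preloading through Class.forName (with or without static initialization) and a list of prime functions that run once without letting a failure abort startup.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for the warm-up idea; not part of the AWS SDK. */
final class WarmUpSketch {

    /** Loads each named class; returns the names that could not be loaded. */
    static List<String> preload(List<String> classNames, boolean initialize) {
        List<String> missing = new ArrayList<>();
        ClassLoader loader = WarmUpSketch.class.getClassLoader();
        for (String name : classNames) {
            try {
                // initialize=true also runs static initializers, not just loading.
                Class.forName(name, initialize, loader);
            } catch (ClassNotFoundException | LinkageError e) {
                missing.add(name);
            }
        }
        return missing;
    }

    /** Runs each prime function once; returns how many completed normally. */
    static int prime(List<Runnable> primeFunctions) {
        int succeeded = 0;
        for (Runnable fn : primeFunctions) {
            try {
                fn.run();
                succeeded++;
            } catch (RuntimeException e) {
                // A failed priming call (e.g. no network) should not fail startup.
            }
        }
        return succeeded;
    }

    public static void main(String[] args) {
        List<String> names = List.of("java.util.concurrent.ConcurrentHashMap",
                                     "com.example.DoesNotExist"); // placeholder name
        List<Runnable> fns = List.of(
                () -> { },
                () -> { throw new IllegalStateException("simulated failure"); });
        System.out.println("missing=" + preload(names, true) + " primed=" + prime(fns));
    }
}
```

In a real client the prime functions would be calls such as client::listTables from the snippet above; whether a failed call still counts as useful warm-up is a design decision this sketch does not settle.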

zoewangg avatar Mar 01 '23 00:03 zoewangg

Hi @zoewangg, thanks for the quick reply. What would the most conventional/default experience be? Something like:

DynamoDbClient client = DynamoDbClient.create();
WarmUpConfiguration configuration = WarmUpConfiguration.builder().build();
client.warmUp(configuration);

Also, any thoughts on the approach that X-Ray uses? https://docs.aws.amazon.com/xray/latest/devguide/xray-sdk-java-awssdkclients.html

samdengler avatar Mar 01 '23 19:03 samdengler

This would be an awesome feature to have!

einarjohnson avatar Mar 07 '23 18:03 einarjohnson

We need this!

Berehulia avatar Mar 22 '23 14:03 Berehulia

@zoewangg I wonder if there's a different approach where one can set a global SDK config, so that all newly created clients are primed by default?

I do note that you've broken the warm-up into different pieces here, e.g. initializeClasses. A global setting might be more about the initializeClasses and preloadClasses parts than the primeFunctions part, which seems to be client-specific.

I feel like the global option might be beneficial in cases where the user is using a large number of AWS SDK clients.
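
To make the global-setting idea concrete, here is a rough sketch under stated assumptions. The property name example.sdk.autoPrime and the GlobalPrimingFlag class are invented for illustration; the SDK defines no such setting. A JVM system property like this could be supplied without code changes, for instance through the JVM's JAVA_TOOL_OPTIONS environment variable (-Dexample.sdk.autoPrime=true).

```java
import java.util.function.Consumer;
import java.util.function.Supplier;

/** Hypothetical global switch for priming; not an AWS SDK feature. */
final class GlobalPrimingFlag {
    // Invented property name, used only in this sketch.
    static final String PROPERTY = "example.sdk.autoPrime";

    /** True only when the property is set to "true" (case-insensitive). */
    static boolean autoPrimeEnabled() {
        return Boolean.parseBoolean(System.getProperty(PROPERTY, "false"));
    }

    /** Builds a client and, if the global flag is on, primes it immediately. */
    static <T> T create(Supplier<T> factory, Consumer<T> primer) {
        T client = factory.get();
        if (autoPrimeEnabled()) {
            primer.accept(client);
        }
        return client;
    }

    public static void main(String[] args) {
        System.setProperty(PROPERTY, "true");
        // StringBuilder stands in for an SDK client in this sketch.
        StringBuilder client = create(StringBuilder::new, c -> c.append("primed"));
        System.out.println(client);
    }
}
```

The generic primer covers the class-loading part that applies to every client; client-specific prime functions would still have to be supplied per client.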

humanzz avatar Mar 23 '23 22:03 humanzz

This should also be expanded to other AWS Services as well.

What we found is that even if the clients are set up as singletons, the first call (without any priming done) still takes a long time. So having this auto-priming feature would be really helpful, especially for cold-start scenarios.

keeed avatar May 16 '23 17:05 keeed

Just to add to this, since my use case is fairly similar: rather than Lambda, we're running on EKS.

In my scenario it's a pretty small Spring Boot app that has a low memory footprint (~30-40 MB heap used, 130 MB default heap size), so the Kubernetes pods only request and limit resources at 0.25 CPU and 256 MB RAM. One thing we've noticed is that several SDK libraries are really slow on the first invocation, but become very fast subsequently.

Some examples of this are:

  • An initial STS call through AwsCredentialsProvider#resolveCredentials where the first (and active) WebIdentityTokenFileCredentialsProvider takes up to 3 seconds to get credentials.
  • The first call to SqsAsyncClient#sendBatch can take 1-3 seconds synchronously, with an extra 3-6 seconds after that happening asynchronously. The next 1-2 requests will take 30-100 ms, while subsequent requests only take about 2-3 ms.
  • The first call to DynamoDbTable#getItem takes about 300 ms, and the first call to DynamoDbTable#query will take another 300 ms. Subsequent calls typically take 5-20 ms, with outliers going up to 80 ms. I did some investigating into this and found that just constructing objects like Update or PutItemEnhancedRequest, or running DynamoDbTable#query without actually fetching any results (so no network call), each take about 100 ms.
  • All of these also trigger garbage collection because 4,000-9,000 classes (some from other libraries like Gson, but mostly from this SDK) are loaded by these calls as part of the first request to the server that uses all of these features. Doubling memory and increasing the minimum heap size also didn't prevent garbage collection from being triggered by these initializations.
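
For anyone wanting to reproduce the class-count observation, one way to measure it is the JVM's standard ClassLoadingMXBean. The sketch below is only an illustration: ClassLoadProbe is a hypothetical helper, and the regex call stands in for a first SDK call. It reports how many classes the JVM loaded while an action ran.

```java
import java.lang.management.ClassLoadingMXBean;
import java.lang.management.ManagementFactory;

/** Hypothetical helper that counts classes loaded during an action. */
final class ClassLoadProbe {

    /** Returns how many classes the JVM loaded while the action ran. */
    static long classesLoadedBy(Runnable action) {
        ClassLoadingMXBean bean = ManagementFactory.getClassLoadingMXBean();
        // Total count only grows, so the difference is never negative.
        long before = bean.getTotalLoadedClassCount();
        action.run();
        return bean.getTotalLoadedClassCount() - before;
    }

    public static void main(String[] args) {
        // Stand-in workload; in practice this would be the first SDK call.
        Runnable work = () -> java.util.regex.Pattern.compile("a+b").matcher("aab").matches();
        System.out.println("first call loaded:  " + classesLoadedBy(work));
        System.out.println("second call loaded: " + classesLoadedBy(work));
    }
}
```

The count is JVM-wide, so classes loaded concurrently by other threads are included; running the probe before and after a priming step gives a rough picture of what the step actually preloads.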

I managed to cut these out by essentially "priming" these SDKs: resolving credentials, doing a describe endpoints call against DynamoDB, generating all of my models and converting them into the relevant Put / PutItemEnhancedRequest / SdkIterable<Page<T>> objects, calling SqsAsyncClient#sendMessage against a queue that doesn't exist, etc. This saves about 4-6 seconds on the first request, but it's a bit inconvenient to maintain this logic myself. It would be nice to have this as an optional part of the SDK client, so everything can be eagerly loaded while my container is starting.
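
The hand-rolled priming described above could be organized roughly as follows. This is a sketch under assumptions, not the commenter's actual code: StartupPrimer and its step names are hypothetical, and the lambdas in main are placeholders for real calls such as resolving credentials. Each named step runs once at startup, its duration is recorded, and a failure (like sending to a queue that doesn't exist) is noted instead of rethrown.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical startup primer; runs named steps and times each one. */
final class StartupPrimer {
    private final Map<String, Runnable> steps = new LinkedHashMap<>();

    /** Registers a priming step; steps run in registration order. */
    StartupPrimer step(String name, Runnable action) {
        steps.put(name, action);
        return this;
    }

    /** Runs every step; maps step name to elapsed ms, or -1 if it threw. */
    Map<String, Long> run() {
        Map<String, Long> report = new LinkedHashMap<>();
        for (Map.Entry<String, Runnable> e : steps.entrySet()) {
            long start = System.nanoTime();
            try {
                e.getValue().run();
                report.put(e.getKey(), (System.nanoTime() - start) / 1_000_000);
            } catch (RuntimeException ex) {
                // Expected for deliberate failures, e.g. a nonexistent queue.
                report.put(e.getKey(), -1L);
            }
        }
        return report;
    }

    public static void main(String[] args) {
        Map<String, Long> report = new StartupPrimer()
                .step("resolve-credentials", () -> { })   // placeholder action
                .step("send-to-missing-queue", () -> {    // placeholder failure
                    throw new IllegalStateException("queue does not exist");
                })
                .run();
        System.out.println(report);
    }
}
```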

michaeldimchuk avatar Jul 20 '23 03:07 michaeldimchuk

This would be amazing!

TobyMellor avatar Jul 28 '23 18:07 TobyMellor

Any update on this?

luketn avatar May 21 '25 14:05 luketn