powertools-lambda-java

Feature request: Priming for `powertools-serialization`

Open phipag opened this issue 5 months ago • 3 comments

Use case

PLEASE READ: Priming documentation: https://github.com/aws-powertools/powertools-lambda-java/blob/main/Priming.md

Parent issue: https://github.com/aws-powertools/powertools-lambda-java/issues/1588 Sample PR: https://github.com/aws-powertools/powertools-lambda-java/pull/1861

Java CRaC can be used to prime an application by implementing beforeCheckpoint() and afterRestore() hooks in selected classes. When used with AWS Lambda SnapStart, the beforeCheckpoint() hook runs before the memory snapshot is taken. This can be leveraged to further reduce restore durations by pre-loading classes and calling commonly used code, so that the resulting state is captured in the memory snapshot.

  1. Class-preloading / automatic priming: Based on a statically generated classesloaded.txt file, loads classes used at runtime into memory
  2. Invoke priming: Execute commonly used code, e.g. to initialize all reflectively accessed objects, warm up caches, TCP connection pools, etc. Common examples include performing "dry" AWS SDK calls (not impacting production resources) or performing JSON serialization/deserialization.
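
As a rough illustration of how the two techniques fit into the checkpoint hook, here is a minimal, JDK-only sketch. PrimingHooks is a hypothetical stand-in for org.crac.Resource (the real interface methods also take a Context argument), and PrimingSketch, its constructor parameters, and loadedCount() are made up for this example; none of this is the Powertools implementation.

```java
import java.util.List;

/** Hypothetical stand-in for the CRaC hooks; the real org.crac.Resource methods take a Context argument. */
interface PrimingHooks {
    void beforeCheckpoint() throws Exception;
    void afterRestore() throws Exception;
}

/** Sketch only: combines class preloading and invoke priming in beforeCheckpoint(). */
final class PrimingSketch implements PrimingHooks {
    private final List<String> classNames; // e.g. the lines of a classesloaded.txt file
    private final Runnable dryRun;         // e.g. a dry JSON (de)serialization call
    private int loaded;

    PrimingSketch(List<String> classNames, Runnable dryRun) {
        this.classNames = classNames;
        this.dryRun = dryRun;
    }

    @Override
    public void beforeCheckpoint() {
        // 1. Class preloading: load and initialize every listed class.
        for (String name : classNames) {
            try {
                Class.forName(name, true, PrimingSketch.class.getClassLoader());
                loaded++;
            } catch (ClassNotFoundException | LinkageError e) {
                // Best effort: a class missing from this classpath must not fail the checkpoint.
            }
        }
        // 2. Invoke priming: run commonly used code once so its state lands in the snapshot.
        dryRun.run();
    }

    @Override
    public void afterRestore() {
        // Nothing to re-establish in this sketch.
    }

    /** Number of classes that were loaded successfully. */
    int loadedCount() {
        return loaded;
    }
}
```

With real CRaC, the class would implement org.crac.Resource and register itself via Core.getGlobalContext(), as described in the priming documentation.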

The goal of this issue is to implement priming techniques for the powertools-serialization module.

Solution/User Experience

Implement priming based on Priming Documentation (feel free to update documentation and suggest improvements):

  • [ ] Automatic priming
  • [ ] Invoke priming

Alternative solutions


Acknowledgment

Future readers

Please react with 👍 and your use case to help us understand customer demand.

phipag avatar Aug 04 '25 10:08 phipag

Hi @phipag! 👋

I'd be happy to help implement priming support for the powertools-serialization module. I've analyzed the module structure and believe this is an excellent candidate for priming implementation, especially given that JSON serialization/deserialization is mentioned in the priming documentation as a prime example of operations that benefit from warming up.

My proposed implementation approach:

  1. Add a Maven profile to powertools-serialization/pom.xml for generating the classes loaded file (similar to the generate-classesloaded-file profile in powertools-metrics)

  2. Generate and clean up the classesloaded.txt file by:

    • Running tests with the VM argument -Xlog:class+load=info:classesloaded.txt
    • Processing the file to extract only the fully qualified class names
    • Moving it to src/main/resources/
  3. Implement CRaC Resource interface in JsonConfig class (the singleton configuration class) by:

    • Adding CRaC dependency and imports

    • Registering with Core.getGlobalContext() in a static block

    • Implementing beforeCheckpoint() to:

      • Call ClassPreLoader.preloadClasses() for automatic priming
      • Perform "dry" JSON serialization/deserialization operations to warm up Jackson ObjectMapper and JMESPath components
      • Initialize commonly used AWS Lambda event types (APIGatewayProxyRequestEvent, SQSEvent, etc.)
  4. Invoke priming examples for serialization module:

    • Warm up ObjectMapper with sample event deserialization
    • Initialize JMESPath function registry
    • Pre-load reflection metadata for common AWS event types
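
To make step 2 more concrete, here is a small sketch of the clean-up pass over the class-load log. It assumes each line looks like `[0.010s][info][class,load] java.lang.Object source: jrt:/java.base`, i.e. bracketed decorators followed by the class name; please check this against the actual file. ClassLoadLogCleaner and extractClassNames are hypothetical names, and skipping entries that contain "/" (hidden and lambda classes, which cannot be loaded by name) is my own choice, not something the priming documentation prescribes.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Sketch only: reduces -Xlog:class+load output to unique fully qualified class names. */
final class ClassLoadLogCleaner {
    // One or more "[...]" decorators, whitespace, then the class name as the first token.
    private static final Pattern LINE = Pattern.compile("^(?:\\[[^\\]]*\\])+\\s+(\\S+)");

    static List<String> extractClassNames(List<String> logLines) {
        Set<String> names = new LinkedHashSet<>(); // keeps first-seen order, drops duplicates
        for (String line : logLines) {
            Matcher m = LINE.matcher(line);
            if (!m.find()) {
                continue; // not a class-load line in the assumed format
            }
            String candidate = m.group(1);
            if (candidate.contains("/")) {
                continue; // hidden/lambda classes such as Foo$$Lambda/0x... cannot be loaded by name
            }
            names.add(candidate);
        }
        return new ArrayList<>(names);
    }
}
```

The resulting list would then be written one name per line to src/main/resources/classesloaded.txt.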

Why JsonConfig is the ideal candidate:

  • It's a singleton that manages ObjectMapper and JMESPath configuration
  • It's used throughout the serialization operations
  • JSON operations heavily benefit from class preloading and initialization
  • It follows the same pattern as MetricsFactory in the reference implementation

This implementation should significantly reduce cold start times for functions that use event deserialization, which is a common pattern in serverless applications.

Would you like me to proceed with this implementation? I can create a PR that follows the established patterns while adding serialization-specific priming optimizations.

Also, are there any specific AWS Lambda event types or serialization scenarios you'd recommend prioritizing for the invoke priming phase?

Thanks for making this accessible as a good first issue! 🚀

dcabib avatar Aug 29 '25 16:08 dcabib

Hey @dcabib,

thanks. This plan sounds great!

I think it makes sense to invoke prime all default events supported by the serialization module in order to fill the Jackson Mapper cache:

import com.amazonaws.services.lambda.runtime.events.APIGatewayProxyRequestEvent;
import com.amazonaws.services.lambda.runtime.events.APIGatewayV2HTTPEvent;
import com.amazonaws.services.lambda.runtime.events.ActiveMQEvent;
import com.amazonaws.services.lambda.runtime.events.ApplicationLoadBalancerRequestEvent;
import com.amazonaws.services.lambda.runtime.events.CloudFormationCustomResourceEvent;
import com.amazonaws.services.lambda.runtime.events.CloudWatchLogsEvent;
import com.amazonaws.services.lambda.runtime.events.KafkaEvent;
import com.amazonaws.services.lambda.runtime.events.KinesisAnalyticsFirehoseInputPreprocessingEvent;
import com.amazonaws.services.lambda.runtime.events.KinesisAnalyticsStreamsInputPreprocessingEvent;
import com.amazonaws.services.lambda.runtime.events.KinesisEvent;
import com.amazonaws.services.lambda.runtime.events.KinesisFirehoseEvent;
import com.amazonaws.services.lambda.runtime.events.RabbitMQEvent;
import com.amazonaws.services.lambda.runtime.events.SNSEvent;
import com.amazonaws.services.lambda.runtime.events.SQSEvent;
import com.amazonaws.services.lambda.runtime.events.ScheduledEvent;

I understand this might need some work to generate fake JSON data for these events.
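
A possible shape for this, as a JDK-only sketch: keep one sample JSON string per event class and push each pair through the deserializer once, ignoring failures so that a bad sample can never break the checkpoint. InvokePrimingSketch and primeAll are hypothetical names, and the deserializer parameter is a placeholder for whatever the module actually calls (for example a Jackson-based function); the sample data itself is not shown here.

```java
import java.util.Map;
import java.util.function.BiFunction;

/** Sketch only: best-effort invoke priming over a set of (event type, sample JSON) pairs. */
final class InvokePrimingSketch {

    /**
     * Deserializes every sample once and returns how many types were primed successfully.
     * The deserializer is a placeholder for the real JSON-to-object call.
     */
    static int primeAll(Map<Class<?>, String> sampleJsonByType,
                        BiFunction<String, Class<?>, Object> deserializer) {
        int primed = 0;
        for (Map.Entry<Class<?>, String> entry : sampleJsonByType.entrySet()) {
            try {
                if (deserializer.apply(entry.getValue(), entry.getKey()) != null) {
                    primed++;
                }
            } catch (RuntimeException e) {
                // Priming is best effort: a broken sample must not fail the checkpoint.
            }
        }
        return primed;
    }
}
```

In the module, the map keys would be the event classes listed above, and the values would be small hand-written or recorded sample payloads.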

Let me know if I should assign the issue to you? 🚀

phipag avatar Sep 01 '25 09:09 phipag

This task is still in the backlog, if anyone bumps into it and is interested in working on it please review the conversation above and leave a comment so we can assign it to you.

dreamorosi avatar Sep 22 '25 12:09 dreamorosi

@dreamorosi I will be happy to pick this up

Attyuttam avatar Dec 22 '25 05:12 Attyuttam

Hi @Attyuttam, that's great! I'm assigning the issue to you - if you have any questions please let us know!

dreamorosi avatar Dec 29 '25 10:12 dreamorosi

Hi @dreamorosi, I went through the module and I think EventDeserializer is the entry point to this module, so the CRaC implementation should be done there. I am already working on the issue https://github.com/aws-powertools/powertools-lambda-java/issues/2004, so based on that, these are the action items I would take to resolve this issue:

  1. Add the profile to generate classesloaded.txt as mentioned in the priming documentation
  2. Parse the classesloaded.txt file and move it to the src/main/resources folder
  3. Register the CRaC resource in the EventDeserializer class
  4. Write a test for the before and after checkpoint methods in EventDeserializerTest

Please let me know if this seems to be the correct course of action. Thanks!

Attyuttam avatar Jan 05 '26 04:01 Attyuttam

Hey @Attyuttam,

this sounds like a good approach. Let's move forward with EventDeserializer as the entry-point and you can progress like in the PR you already sent for powertools-tracing.

This would complete the automatic priming task.

If you want to go further and implement invoke priming as well, you can look at my comment here. If you can invoke extractDataFrom such that all event objects from the AWS SDK are covered, we will already have a warm Jackson reflection cache for all the object types, leading to faster first-time deserialization for all these events when priming is enabled.

phipag avatar Jan 05 '26 10:01 phipag