Feature request: Implement mechanism to update `classesloaded.txt` file for automatic priming

Open phipag opened this issue 5 months ago • 6 comments

Use case

PLEASE READ: Priming documentation: https://github.com/aws-powertools/powertools-lambda-java/blob/main/Priming.md

PR introducing class-preloading: https://github.com/aws-powertools/powertools-lambda-java/pull/1861.

This project uses class pre-loading to implement automatic priming, which reduces AWS Lambda SnapStart restore duration. The class pre-loader reads the classesloaded.txt file of each Powertools module that implements automatic priming and attempts to load every class listed in it before SnapStart takes the memory snapshot. Classes that cannot be found are ignored (this is the case for test classes, for example).
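
To make the pre-loading behavior concrete, here is a minimal sketch of the idea. It is not the implementation from the PR linked above: the class name `ClassListPreloader` and the method `preload` are hypothetical, and the class names are assumed to have already been read from classesloaded.txt, one per line.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, for illustration only (not the Powertools implementation).
final class ClassListPreloader {

    private ClassListPreloader() {
    }

    /**
     * Tries to load and initialize every class in the list.
     * Returns the names that loaded successfully; all others are skipped.
     */
    static List<String> preload(List<String> classNames, ClassLoader loader) {
        List<String> loaded = new ArrayList<>();
        for (String rawName : classNames) {
            String className = rawName.trim();
            if (className.isEmpty()) {
                continue; // tolerate blank lines in the file
            }
            try {
                Class.forName(className, true, loader);
                loaded.add(className);
            } catch (ClassNotFoundException | LinkageError e) {
                // Class is not on the classpath (e.g. a test class) or cannot be linked: ignore it.
            }
        }
        return loaded;
    }

    public static void main(String[] args) {
        List<String> names = List.of("java.util.ArrayList", "com.example.DoesNotExist", "java.lang.String");
        // prints [java.util.ArrayList, java.lang.String]
        System.out.println(preload(names, ClassListPreloader.class.getClassLoader()));
    }
}
```

Catching `LinkageError` in addition to `ClassNotFoundException` is a choice made for this sketch, so that a class whose dependencies are missing does not abort the whole run.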

The goal of this issue is to design and implement a mechanism that automatically keeps the classesloaded.txt file up to date as the project and the code in each module evolve. An individual contributor should not need any knowledge of AWS Lambda SnapStart or priming techniques to contribute a change to this project. The process should be as automated as possible.

Solution/User Experience

Idea (please suggest alternatives if you have another idea)

Create a GitHub workflow that runs whenever a merge to the main branch happens.

Workflow Steps:

  1. Merge to Main - Trigger on push to main branch (after PR merge)
  2. Checkout Code - Get the latest main branch code
  3. Java Files Changed? - Check if any .java files were modified in the merge
  4. Identify Affected Powertools Modules - Determine which modules need updates
  5. Generate classesloaded.txt - Create the runtime classes file for each module
  6. Clean Files - Apply sed commands as per Priming documentation
    • Example sed command: sed 's/.*\[class,load\] \([^ ]*\) source:.*/\1/' classloaded.txt > classloaded_clean.txt
  7. Sort File Contents - Sort file contents to ensure stable diffs
  8. Files Have Diff? - Check if generated files differ from existing ones
  9. Create Update PR - Create a new PR with the updated classesloaded.txt files (if there is a diff)
flowchart TD
    A[Merge to Main Branch] --> B[Checkout Code]
    B --> C{Java Files Changed?}
    C -->|No| D[Stop - No Action Needed]
    C -->|Yes| E[Identify Affected Powertools Modules]
    E --> F[Generate classesloaded.txt for Each Module]
    F --> G[Clean Affected Files Using sed Command]
    G --> H[Sort File Contents to Ensure Stable Diffs]
    H --> I{Files Have Diff?}
    I -->|No| J[Stop - No Changes]
    I -->|Yes| K[Create New Branch]
    K --> L[Commit Changes to New Branch]
    L --> M[Create PR to Update classesloaded.txt]
    M --> N[End - PR Ready for Review]
    
    style A fill:#e1f5fe
    style D fill:#ffebee
    style J fill:#ffebee
    style N fill:#e8f5e8
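
The clean, sort, and diff steps (6 to 8) can also be sketched in Java. This is only an illustration under a few assumptions: the class `ClassListCleaner` and its methods are hypothetical, the regular expression mirrors the sed expression from step 6, and the sample log lines only imitate the shape that expression expects. One deliberate difference from sed: lines that do not match are dropped here, whereas sed would pass them through unchanged.

```java
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical helper, for illustration only.
final class ClassListCleaner {

    // Same shape as the sed expression: anything, "[class,load] ", the class name, " source:", anything.
    private static final Pattern CLASS_LOAD_LINE =
            Pattern.compile(".*\\[class,load\\] ([^ ]*) source:.*");

    private ClassListCleaner() {
    }

    /** Extracts the class name from each matching log line and sorts the names for stable diffs. */
    static List<String> cleanAndSort(List<String> logLines) {
        return logLines.stream()
                .map(CLASS_LOAD_LINE::matcher)
                .filter(Matcher::matches)   // non-matching lines are dropped (unlike sed)
                .map(m -> m.group(1))
                .sorted()
                .collect(Collectors.toList());
    }

    /** True when the regenerated list differs from the committed one, i.e. an update PR is needed. */
    static boolean hasDiff(List<String> committed, List<String> regenerated) {
        return !committed.equals(regenerated);
    }

    public static void main(String[] args) {
        List<String> log = List.of(
                "[0.015s][info][class,load] java.lang.String source: shared objects file",
                "[0.014s][info][class,load] java.lang.Object source: shared objects file",
                "some unrelated log line");
        List<String> regenerated = cleanAndSort(log);
        System.out.println(regenerated);                                       // [java.lang.Object, java.lang.String]
        System.out.println(hasDiff(List.of("java.lang.Object"), regenerated)); // true
    }
}
```

Sorting before comparing means a change in class-loading order alone never produces a diff, which is the point of step 7.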

Future readers

Please react with 👍 and your use case to help us understand customer demand.

phipag, Aug 04 '25 10:08

Hi @phipag, great details. I am happy to look into this automation.

subhash686, Aug 04 '25 12:08

Hey @subhash686, thanks for engaging. This sounds awesome 🚀.

Let me assign this issue to you and add it to our current iteration. Feel free to post questions and let me know if you need any assistance testing things.

phipag, Aug 04 '25 12:08

Hey @subhash686,

I made a small update to my initial design proposal. I think it is better not to commit directly into a PR opened by a contributor. Instead, we should trigger the workflow when a merge to the main branch happens and automatically create a new PR if needed. A maintainer can then review that PR separately. This is similar to Dependabot, but for classesloaded.txt.

phipag, Aug 05 '25 09:08

Hi @phipag and @subhash686, thanks for working on this. The priming pattern is super useful for customers working with Java on Lambda.

While the initial idea of mutating the pull request by regenerating files and including them works, I'm not a big fan of it, and I think we should have an automation for this. For more context, we were doing something similar in Powertools Python, and at some point all the PRs got modified with new files, which made it hard to review PRs and to understand who was responsible for which change.

That said, I think we have two options, and I don't have a preference:

1/ Create a new workflow that runs on push to the main branch, detects the folders modified by the merged PR, and regenerates the classesloaded.txt file.

2/ Create a workflow that runs on a schedule (every day at 9:00 AM, for example) and iterates over all modules, regenerating each classesloaded.txt file. In this case, you don't need to worry about detecting the changed code; the workflow simply works on a best-effort basis.
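
For option 1, the "detect the modified folders" part could look roughly like the sketch below. It rests on a loud assumption: a module is identified by the first path segment of a changed .java file, which may not hold for nested modules in this repository. The class `AffectedModules` and the sample paths are hypothetical.

```java
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper, for illustration only.
final class AffectedModules {

    private AffectedModules() {
    }

    /**
     * Maps changed file paths (relative to the repository root) to module folders.
     * ASSUMPTION: the module is the first path segment; only .java files count.
     */
    static Set<String> fromChangedPaths(List<String> changedPaths) {
        Set<String> modules = new TreeSet<>(); // sorted for deterministic output
        for (String path : changedPaths) {
            int firstSlash = path.indexOf('/');
            if (path.endsWith(".java") && firstSlash > 0) {
                modules.add(path.substring(0, firstSlash));
            }
        }
        return modules;
    }

    public static void main(String[] args) {
        List<String> changed = List.of(
                "powertools-metrics/src/main/java/Foo.java",
                "README.md",
                "powertools-logging/src/main/java/Bar.java",
                "powertools-metrics/pom.xml");
        // prints [powertools-logging, powertools-metrics]
        System.out.println(fromChangedPaths(changed));
    }
}
```

An empty result corresponds to the "no Java files changed" case, in which the workflow can stop early.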

I'm not sure whether running this script will change files that shouldn't be changed. If it doesn't, you don't need to worry about sed/sha and other stuff: just check whether any file has changed and stop the workflow if not. You can use git status --porcelain for that. But I might be wrong here.
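
As a rough sketch of that check: git status --porcelain prints nothing when the working tree is clean, so "has anything changed" reduces to "is the output non-blank". The class `WorkingTreeCheck` and its methods are hypothetical, and the sketch assumes git is on the PATH and that the process runs inside a checkout.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;

// Hypothetical helper, for illustration only.
final class WorkingTreeCheck {

    private WorkingTreeCheck() {
    }

    /** A clean working tree produces empty porcelain output. */
    static boolean hasChanges(String porcelainOutput) {
        return !porcelainOutput.isBlank();
    }

    /** Runs "git status --porcelain" in the given directory and returns its output. */
    static String gitStatusPorcelain(Path repoDir) throws IOException, InterruptedException {
        Process process = new ProcessBuilder("git", "status", "--porcelain")
                .directory(repoDir.toFile())
                .redirectErrorStream(true)
                .start();
        String output = new String(process.getInputStream().readAllBytes(), StandardCharsets.UTF_8);
        int exitCode = process.waitFor();
        if (exitCode != 0) {
            throw new IOException("git status failed with exit code " + exitCode + ": " + output);
        }
        return output;
    }

    public static void main(String[] args) throws Exception {
        String status = gitStatusPorcelain(Path.of("."));
        System.out.println(hasChanges(status) ? "changes detected" : "working tree clean");
    }
}
```

Keeping `hasChanges` separate from the process call makes the decision logic testable without a git repository.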

Please let me know if you need any help with this workflow.

leandrodamascena, Aug 05 '25 09:08

Hey @subhash686,

let me know if you would still like to work on this issue. After giving it some thought, I believe it might be hard for you to test the automation in a repository fork with GitHub Actions. This one is potentially easier for the maintainers to work on.

Let me know what you think. I can propose an initial draft of automation and we can review it together as well.

I also created a full list of priming-related tasks as sub-issues here, in case you would like to work on a different SnapStart priming topic: https://github.com/aws-powertools/powertools-lambda-java/issues/1588

phipag, Aug 19 '25 09:08

Hi @phipag, I was wondering how much I could play with GitHub Actions as a contributor. Happy to collaborate and review with you while you or other maintainers take care of it. Thanks.

subhash686, Aug 21 '25 03:08