
Java snapstart issue

Open ledoux7 opened this issue 2 years ago • 8 comments

Judging by the numbers, I don't think the current trick of updating the function configuration actually makes your tests use SnapStart. It may take some time before AWS has enabled SnapStart on a newly updated function.

image

ledoux7 avatar Dec 04 '23 13:12 ledoux7

Interesting! I will test with a sleep to see if that affects the numbers. Thanks for reporting it, @ledoux7.

maxday avatar Dec 04 '23 15:12 maxday

I could not reproduce this. When you say "It possibly takes some time before AWS has made the snapstart feature enabled on newly updated functions", are you referring to the docs or any other official source? Thanks!

maxday avatar Dec 07 '23 15:12 maxday

I don't have any official source, unfortunately, but in my personal experience with SnapStart (with Kotlin) it only seems to work consistently if we do a fresh deploy from CDK. So changing something like an environment variable through the console doesn't seem to trigger a rebuild of the snapshot. I haven't dug that deep into it, and had assumed it was only a matter of time.

Could you maybe add a log line to the init part of the handler to confirm whether it runs? If SnapStart is used, the init code should not run again at invocation time.

ledoux7 avatar Dec 07 '23 17:12 ledoux7

Or the problem might be that it does not invoke the correct version: list-versions-by-function seems to return the $LATEST version as well. https://awscli.amazonaws.com/v2/documentation/api/latest/reference/lambda/list-versions-by-function.html#description

I'm also not sure this does what it should. Wouldn't the new versions of the lambda still be missing by the time you call let arns = lambda_manager.list_versions_by_function(runtime).await?; and you then loop over the first 10 versions of this lambda, which also include $LATEST?

async fn invoke_snapstart<'a>(
    runtime: &Runtime,
    retry: &RetryManager,
    lambda_manager: &LambdaManager<'a>,
) -> Result<(), Error> {
    let arns = lambda_manager.list_versions_by_function(runtime).await?;
    for i in 0..10 {
        info!("snapstart run #: {}", i);
        if let Some(arn) = arns.get(i) {
            info!("arn = {}", arn);
            retry
                .retry_async(|| async {
                    lambda_manager
                        .update_function_configuration(&runtime.function_name())
                        .await
                })
                .await?;
            info!("function updated to ensure cold start");
            thread::sleep(Duration::from_secs(5));
            retry
                .retry_async(|| async { lambda_manager.invoke_function(arn).await })
                .await?;
            info!("function invoked");
        }
    }
    Ok(())
}
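One way to rule out the $LATEST concern in the loop above would be to drop every ARN that is not qualified by a numbered version before invoking anything. The following is only a minimal sketch: published_version_arns is a hypothetical helper, not part of lambda-perf, and it assumes each ARN in the list ends with a :<qualifier> segment (either $LATEST or a version number).

```rust
/// Keeps only the ARNs whose last `:`-separated segment is a version number.
/// Hypothetical helper; assumes ARNs shaped like
/// `arn:aws:lambda:<region>:<account>:function:<name>:<qualifier>`.
/// Caveat: an unqualified ARN for a function whose name is all digits
/// would also pass this check.
fn published_version_arns(arns: &[String]) -> Vec<&String> {
    arns.iter()
        .filter(|arn| {
            arn.rsplit(':')
                .next()
                // `$LATEST` and unqualified ARNs fail the all-digits test.
                .map(|q| !q.is_empty() && q.chars().all(|c| c.is_ascii_digit()))
                .unwrap_or(false)
        })
        .collect()
}
```

With a filter like this, arns.get(i) would only ever see numbered versions, which matters because SnapStart applies to published versions rather than to $LATEST.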

A better approach could be to create an alias named SnapStart, update the version it points to each time you update the function, and always invoke that alias.
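To make the alias idea concrete, here is a rough sketch that builds the AWS CLI argument lists for the three steps: publish a new version, repoint the alias, invoke through the alias. It shells out to the aws binary only for brevity; the function names and the run_aws helper are hypothetical, and the alias is assumed to exist already (it would need to be created once with create-alias).

```rust
use std::process::{Command, Output};

/// Alias name taken from the suggestion above.
const ALIAS: &str = "SnapStart";

fn to_args(parts: &[&str]) -> Vec<String> {
    parts.iter().map(|s| s.to_string()).collect()
}

/// `aws lambda publish-version`: creates a new numbered version.
fn publish_version_args(function: &str) -> Vec<String> {
    to_args(&["lambda", "publish-version", "--function-name", function])
}

/// `aws lambda update-alias`: points the alias at the given version.
fn update_alias_args(function: &str, version: &str) -> Vec<String> {
    to_args(&[
        "lambda", "update-alias",
        "--function-name", function,
        "--name", ALIAS,
        "--function-version", version,
    ])
}

/// `aws lambda invoke`: invokes through the alias and writes the
/// response payload to `out_file`.
fn invoke_alias_args(function: &str, out_file: &str) -> Vec<String> {
    to_args(&[
        "lambda", "invoke",
        "--function-name", function,
        "--qualifier", ALIAS,
        out_file,
    ])
}

/// Runs the `aws` CLI with the given arguments (requires the CLI on PATH).
fn run_aws(args: &[String]) -> std::io::Result<Output> {
    Command::new("aws").args(args).output()
}
```

A benchmark run would call run_aws with each argument list in that order, reading the new version number from the publish-version output before updating the alias. Whether a wait is needed between publishing and invoking is exactly the open question in this thread, so the sketch leaves that out.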

ledoux7 avatar Dec 07 '23 17:12 ledoux7

I'm gonna second that. Put 2 different versions of the same code on the lambda (e.g. versions 1 and 2 that you can swap between), set the concurrency to 1 container at a time, and use an alias. Then, to be completely sure you have started a new container, you can change the version the alias points to and invoke against that alias afterwards.

You'll still want to sleep for 5 seconds to ensure your next invoke isn't waiting for the previous container to be killed.

croconut avatar Dec 29 '23 22:12 croconut

I just came here to see what the issue with Java snapstart was as it seemed broken 😅

Was anyone able to test if the above would resolve the issue and report correct snapstart times?

beeradmoore avatar Nov 19 '24 03:11 beeradmoore

I thought snapstart was about subsequent starts, so I don't see how it would ever be faster at booting the initial (cold timings), as it's doing the work it was already doing, plus a bit. Where I'd expect to see it shine is the warm boots and perhaps scaling to n.

Lewiscowles1986 avatar Oct 21 '25 20:10 Lewiscowles1986

> I thought snapstart was about subsequent starts, so I dont see how it would ever be faster booting the initial (cold timings) as it's doing the work it was doing, plus a bit. Where I'd expect to see it shine is the warm boots and perhaps scaling to n

SnapStart takes a snapshot/image at deployment time, and the first call restores this image. From my observations, this is faster than launching a standard JVM and loading the application classes (even for a simple hello world). It is interesting to note that the image is associated with a specific version (the one created at deployment time). If you don't invoke this specific version, you don't use the image.

mcampora avatar Nov 21 '25 13:11 mcampora