
Humongous allocation with CRT client when downloading chunks of files

[Open] jasonstack opened this issue 2 months ago • 5 comments

Describe the bug

When downloading a 4 MB chunk from a large file, we were seeing humongous allocations coming from native code: the CRT client passes a heap byte[] (bodyBytesIn) to the caller.

Can we make it off-heap?

Version: aws-crt:0.33.9

Lowering the part size results in smaller allocations, but the heap pressure remains.

We are using 8 MB as the part size. Will lowering it to 512 KB affect partial-download or full-file download performance?
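
For context, this is roughly how we configure the part size (a minimal sketch; the crtBuilder option shown is the standard one, and the 8 MB value is ours):

    import software.amazon.awssdk.services.s3.S3AsyncClient;

    // CRT-based async client with an 8 MB part size; each downloaded part
    // currently surfaces on the Java side as one heap byte[] of this size.
    S3AsyncClient s3Async = S3AsyncClient.crtBuilder()
            .minimumPartSizeInBytes(8L * 1024 * 1024)
            .build();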

class S3MetaRequestResponseHandlerNativeAdapter {

    int onResponseBody(byte[] bodyBytesIn, long objectRangeStart, long objectRangeEnd) {
        return this.responseHandler.onResponseBody(ByteBuffer.wrap(bodyBytesIn), objectRangeStart, objectRangeEnd);
    }
}

Regression Issue

  • [ ] Select this option if this issue appears to be a regression.

Expected Behavior

An off-heap buffer should be used.

Current Behavior

A heap buffer is used.

Reproduction Steps

Download a 4 MB chunk from a large file:

        // Request a byte range from a large object; pos and size are the
        // chunk offset and length.
        GetObjectRequest request = GetObjectRequest.builder()
                .bucket(bucket)
                .key(rkey)
                .range(String.format("bytes=%d-%d", pos, pos + size - 1))
                .build();

        // asyncResponseTransformerImp is our custom AsyncResponseTransformer.
        return s3Async.getObject(request, asyncResponseTransformerImp);

Possible Solution

No response

Additional Information/Context

No response

AWS Java SDK version used

aws-crt:0.33.9

JDK version used

JDK22

Operating System and version

Ubuntu 22.04 LTS. Also reproduced on Mac M1.

jasonstack commented on Nov 14 '25

Hi @jasonstack, thank you for reaching out. I'm trying to better understand your use case.

Are you manually downloading 4 MB chunks of a file, then manually combining the chunks to reassemble the file?

  1. If the answer is yes, have you tried using S3TransferManager? S3TransferManager downloadFile (and other APIs too, see the documentation) will automatically use multipart operations when used with the CRT-based S3 client:
    FileDownload downloadFile = transferManager.downloadFile(downloadFileRequest);
  2. If you can use s3Async.getObject(request, Path) instead of using asyncResponseTransformer, this will write the file directly to disk and greatly improve memory usage. This was released in SDK version 2.32.11. (See the sketch after this list.)
  3. Which version of the Java SDK are you using? aws-crt:0.33.9 is fairly old, so I'm assuming you're using an old version of the Java SDK as well. We improved the memory management in S3 operations in recent releases, so I recommend you test a more recent SDK version (2.38.7 is the latest) to see if you still experience the high memory allocation.
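
A minimal sketch of suggestions 1 and 2 (the bucket, key, and destination paths are placeholders):

    import java.nio.file.Paths;
    import software.amazon.awssdk.services.s3.S3AsyncClient;
    import software.amazon.awssdk.services.s3.model.GetObjectRequest;
    import software.amazon.awssdk.transfer.s3.S3TransferManager;
    import software.amazon.awssdk.transfer.s3.model.DownloadFileRequest;
    import software.amazon.awssdk.transfer.s3.model.FileDownload;

    S3AsyncClient s3Async = S3AsyncClient.crtBuilder().build();

    // Suggestion 1: S3TransferManager splits the download into parts automatically.
    S3TransferManager transferManager = S3TransferManager.builder().s3Client(s3Async).build();
    FileDownload download = transferManager.downloadFile(
            DownloadFileRequest.builder()
                    .getObjectRequest(b -> b.bucket("amzn-s3-demo-bucket").key("my-key"))
                    .destination(Paths.get("/tmp/my-file"))
                    .build());
    download.completionFuture().join();

    // Suggestion 2: getObject(request, Path) streams the body straight to disk.
    GetObjectRequest request = GetObjectRequest.builder().bucket("amzn-s3-demo-bucket").key("my-key").build();
    s3Async.getObject(request, Paths.get("/tmp/my-file-2")).join();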

Let us know if you have questions about the suggestions above.

If you upgrade to a more recent version and still see the memory issue, please provide a self-contained SSCCE code example we can use to reproduce, and a heap dump.

debora-ito commented on Nov 14 '25

Thanks for the reply.

Are you manually downloading 4 MB chunks of a file, then manually combining the chunks to reassemble the file?

Not combining the chunks.

If you can use s3Async.getObject(request, Path) instead of using asyncResponseTransformer, this will write the file directly to disk and greatly improve memory usage.

Right, this might help to reduce memory pressure. Initially, the plan was to serve the buffer before writing to disk to reduce some latency. It's worth a try.

Which version of the Java SDK are you using? aws-crt:0.33.9 is fairly old, so I'm assuming you're using an old version of the Java SDK as well. We improved the memory management in S3 operations in recent releases, so I recommend you test a more recent SDK version (2.38.7 is the latest)

It's s3:2.31.18.

According to the latest CRT native adapter code, it still passes a byte[] from native code, so I don't think it avoids the humongous allocations.

jasonstack commented on Nov 16 '25

Right, this might help to reduce memory pressure. Initially, the plan was to serve the buffer before writing to disk to reduce some latency. It's worth a try.

Let us know if writing to disk works for your use case.

If you want us to look further into the memory issue, please send us repro code and a heap dump.

debora-ito commented on Nov 17 '25

Hi @debora-ito,

I'm a colleague of @jasonstack, and I'd like to try to rephrase the issue a bit.

The main context is that we would like to use the s3Async.getObject(request, transformer) API and not s3Async.getObject(request, Path). I don't want to focus too much on why: the API exists, so surely it's meant to be used. Suffice it to say that we need access to the retrieved bytes in memory, regardless of whether we also write them to disk.
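
For illustration, here is a minimal sketch of the kind of transformer we use (the class and helper names are ours, not the SDK's):

    import java.nio.ByteBuffer;
    import java.util.concurrent.CompletableFuture;
    import software.amazon.awssdk.core.async.AsyncResponseTransformer;
    import software.amazon.awssdk.core.async.SdkPublisher;
    import software.amazon.awssdk.services.s3.model.GetObjectResponse;

    class TransformingResponseHandler implements AsyncResponseTransformer<GetObjectResponse, Void> {
        private final CompletableFuture<Void> result = new CompletableFuture<>();

        @Override
        public CompletableFuture<Void> prepare() { return result; }

        @Override
        public void onResponse(GetObjectResponse response) { /* headers are available here */ }

        @Override
        public void onStream(SdkPublisher<ByteBuffer> publisher) {
            // Each ByteBuffer the client emits lands here; with the CRT client
            // this is a single part-sized (e.g. 8 MB) heap array.
            publisher.subscribe(this::transform).whenComplete((r, t) -> {
                if (t != null) result.completeExceptionally(t); else result.complete(null);
            });
        }

        @Override
        public void exceptionOccurred(Throwable error) { result.completeExceptionally(error); }

        private void transform(ByteBuffer chunk) { /* apply our transformation to the bytes */ }
    }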

Further, we care about efficiency, so we try to follow S3's best practices and use fairly large byte-range requests, say 8 MB. In particular, the S3 best-practices whitepaper says "Typical sizes for byte-range requests are 8 MB or 16 MB" (page 5), and I believe S3TransferManager also uses 8 MB ranges under the hood for that reason.

The issue is that if you follow those best practices with the CRT client and the s3Async.getObject(request, transformer) API, the client allocates contiguous 8 MB byte arrays on the heap, which counts as a humongous allocation for any sanely configured JVM (under G1, any allocation larger than half a heap region is humongous). And humongous allocations are far from optimal from a GC point of view.

This isn't specific to our use case; it's how the code works: s_on_s3_meta_request_body_callback in s3_client.c always allocates a JNI byte array for the full response body, and calls the S3MetaRequestResponseHandlerNativeAdapter#onResponseBody method mentioned in the description only once, with that full 8 MB byte array.

Note that this isn't an issue with the Netty-based client, as it relies on Netty's default HttpClientCodec, which processes the response in 8 KB chunks.

To be clear, for our own use case we can and are working around this, in particular by switching from the CRT client to the Netty-based one (and yes, s3Async.getObject(request, Path) is another workaround where appropriate).
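
For reference, that switch is just a matter of building the async client with the Netty HTTP client (a minimal sketch, default settings assumed):

    import software.amazon.awssdk.http.nio.netty.NettyNioAsyncHttpClient;
    import software.amazon.awssdk.services.s3.S3AsyncClient;

    // Netty-based async client: the response body is delivered to the
    // transformer in small (~8 KB) chunks rather than part-sized arrays.
    S3AsyncClient nettyS3Async = S3AsyncClient.builder()
            .httpClientBuilder(NettyNioAsyncHttpClient.builder())
            .build();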

But I admit that "following documented best practices with a public API leads to notoriously bad GC behavior" feels like a bug to me, so we figured this might be worth reporting.

As a small aside: since the CRT client uses native code, we were initially (a bit naively) hoping that the response bytes would be passed to the transformer as a direct ByteBuffer over the native network buffer, which would bypass this humongous-allocation issue (even if the body is passed to the transformer as a single buffer). And performance-wise, this would be great. That is the context of the "Can we make it off-heap" ask in the description. In hindsight, we understand that API-wise this would make use-after-free bugs easy to write, so it's probably not an option. Still, we'd love and would use such an option if it were available (even if that meant promising not to use the ByteBuffer passed to Consumer#onNext after the call returns within the transfer).

pcmanus commented on Nov 18 '25

CRT was primarily designed from the perspective of wanting to get multi-GB transfers to go as quickly as possible, and it makes some tradeoffs to support that. The primary one is that CRT will burn memory to support scaling to multiple parallel transfers. What this means on the native side is allocating multiple part-sized buffers that are all written in parallel (ignoring buffer reuse, etc., for simplicity's sake). Those buffers are delivered across the JNI boundary, in order, as soon as data is available, and since the native side already holds a part-sized buffer, it hands Java a part-sized buffer.

We are aware that passing big buffers across JNI is not ideal behavior, and it is one area that we are actively working on to eliminate the need to create huge Java arrays. It sounds like in your case you are interested in writing data directly to a file; if the CRT client could just write directly to disk, instead of passing data to the Java side which then writes the file, would that have sidestepped your memory concern?

DmitriyMusatkin commented on Dec 10 '25

It sounds like in your case you are interested in writing data directly to a file

Not quite. We have cases where we're essentially applying a transformation to the downloaded bytes. We do happen to ultimately write the resulting bytes to disk, but the point is that it's not direct-to-file. (We have multiple use cases, though, and some of them are multi-GB transfers directly to file; for those we're using the CRT client successfully.)

We are aware that passing big buffers across JNI is not ideal behavior, and it is one area that we are actively working on to eliminate the need to create huge Java arrays.

Nice. Our goal with this issue was mostly to report that we ran into the consequences of this current non-ideal behavior, and I suppose it surprised us. We had no similar issue with the Netty-based client, and saw no prior warning that this might be a problem when switching to the CRT one (maybe it's documented somewhere and we missed it, though).

But we have switched back to the Netty-based client for now and aren't blocked on this. If you're tracking the "eliminate the need to create huge Java arrays" improvement elsewhere, feel free to close this ticket.

pcmanus commented on Dec 11 '25

Note: I think that in the Netty flow, where it is streaming to memory, it will default to only having one HTTP connection open. So it's a tradeoff between performance and memory usage: the Netty flow is much better from a memory perspective but potentially slower, while the CRT approach can achieve much higher bandwidth but its memory usage is non-ideal for now. I'll leave the decision on whether to keep this open to the Java team. We are internally tracking potential approaches to improve the java-crt integration, but I don't think there is anything on GitHub right now.

DmitriyMusatkin commented on Dec 11 '25