
Investigate reducing memory usage of chunk uploading

Open · szokeasaurusrex opened this issue · 9 comments

Chunk uploading appears to consume a very large amount of memory when uploading large files, since we appear to read the entire file contents into memory. For instance, the (currently experimental) Proguard chunk uploading feature greatly increases memory usage compared to when the experimental chunk uploading feature is turned off. While profiling an upload of a 160 MB Proguard file, I observed memory usage of >300 MB with chunk uploading, vs ~90 MB with chunk uploading disabled.

As this problem likely affects all chunk uploads, not just Proguard files, we should investigate whether we can reduce memory usage, e.g. by not reading entire files into memory.

szokeasaurusrex · Dec 03 '24

Apparently chunk uploading's large memory usage is causing problems for the Ingest team. Thanks @IanWoodard and @asottile-sentry for bringing this to my attention.

szokeasaurusrex · Jan 02 '25

just so it's written down -- my hypothesis is it might not be the chunk uploading -- but the preprocessing / loading of the debug files themselves. though probably the easiest way to get to the bottom of it would be to use some rust memory profiler to better understand where the overhead is happening

in ~theory a proper streaming uploader shouldn't need much more than the chunksize of total memory to perform a successful upload (on the order of MBs ideally) though we're running out of multi-GB headroom at the moment

asottile-sentry · Jan 02 '25

With the memory profiler, I discovered that what is eating memory is indeed the chunk uploading process. The reason we consume much more than one chunk's worth of memory is that we send multiple 8 MiB chunks at once (we group multiple chunks into requests of up to 32 MiB). Additionally, due to how the curl library works, we need to allocate twice that amount of memory per request (64 MiB). On top of that, we use 8 threads to upload requests concurrently, yielding a theoretical memory usage of 64 MiB * 8 = 512 MiB. In reality, we use even more than this (I observed close to 900 MB in my tests while uploading a ~500 MB file), likely due to overhead.

We can greatly reduce memory usage by reducing the request and chunk sizes. Setting both to 1 MiB yielded a maximum memory usage of ~130 MB in my testing (again, with a ~500 MB file), with only a slight slowdown.
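
To make the arithmetic above concrete, here is a tiny standalone sketch (the constants mirror the numbers from this comment; the helper is made up and is not sentry-cli code):

const MIB: u64 = 1024 * 1024;

// Rough model: curl holds roughly two copies of each request body, and each
// upload thread has one request in flight.
fn theoretical_peak(request_size: u64, threads: u64) -> u64 {
    2 * request_size * threads
}

fn main() {
    // Current settings: 8 MiB chunks grouped into 32 MiB requests, 8 threads.
    assert_eq!(theoretical_peak(32 * MIB, 8) / MIB, 512);
    // With 1 MiB requests and the same 8 threads, the buffers drop to 16 MiB;
    // the ~130 MB observed in testing is then mostly other overhead.
    assert_eq!(theoretical_peak(MIB, 8) / MIB, 16);
}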

szokeasaurusrex · Jan 07 '25

During an offline discussion with @jan-auer and @mitsuhiko, we decided that we will not adjust the chunk and request sizes from their current settings, due to the high risk associated with changing these and because the more robust solution would be to simply stream the chunks, avoiding the need to ever load them into memory in the first place.

Although curl provides an API to stream data into a multipart form (which is what we currently use to send the chunks), there is no Rust binding for this functionality. Because of this, we have decided to migrate the chunk uploading API code from curl to reqwest, which does support streaming multipart forms.

We will divide this work into two stages:

  1. Move the current chunk uploading request logic from curl to reqwest, without changing how the upload is performed (i.e. we will still load the chunks entirely into memory, as we currently do).
  2. Once we are using reqwest, refactor the code so that we stream the chunks, rather than reading them entirely into memory.
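
As a rough illustration of stage 2, here is a minimal sketch of a streaming multipart upload with reqwest (assuming the multipart and stream features, plus tokio and tokio-util; the function and field names are placeholders, not the actual sentry-cli API):

use reqwest::multipart::{Form, Part};
use reqwest::Body;
use tokio_util::io::ReaderStream;

async fn upload_chunk(url: &str, path: &str) -> Result<(), Box<dyn std::error::Error>> {
    let file = tokio::fs::File::open(path).await?;
    // Wrap the file in a stream so reqwest pulls it piece by piece instead of
    // requiring the whole chunk to be loaded into memory up front.
    let stream = ReaderStream::new(file);
    let part = Part::stream(Body::wrap_stream(stream)).file_name("chunk");
    let form = Form::new().part("file", part);

    reqwest::Client::new()
        .post(url)
        .multipart(form)
        .send()
        .await?
        .error_for_status()?;
    Ok(())
}

Note that a streamed part has no known length up front, so reqwest will typically send the request with chunked transfer encoding, which the upload endpoint needs to accept.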

We may want to consider moving the rest of the code to reqwest in the long term. However, that would be a separate effort; migrating anything other than chunk uploads to reqwest is out of scope for this initiative.

szokeasaurusrex · Jan 15 '25

Or, perhaps an easier option (thanks @mitsuhiko for pointing this out): write our own logic to generate the multipart form, which should be trivial, and then stream it with curl.
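
A rough sketch of that approach, assuming the Rust curl crate's read_function to stream the body (the boundary, field name, and helper are illustrative only, not the actual sentry-cli code):

use std::fs::File;
use std::io::{Cursor, Read};

use curl::easy::{Easy, List, ReadError};

fn stream_chunk(url: &str, path: &str) -> Result<(), Box<dyn std::error::Error>> {
    let boundary = "------sentry-cli-boundary";
    let preamble = format!(
        "--{boundary}\r\n\
         Content-Disposition: form-data; name=\"file\"; filename=\"chunk\"\r\n\
         Content-Type: application/octet-stream\r\n\r\n"
    );
    let epilogue = format!("\r\n--{boundary}--\r\n");

    let file = File::open(path)?;
    let total_len = preamble.len() as u64 + file.metadata()?.len() + epilogue.len() as u64;

    // Chain the hand-written multipart framing around the file so curl can
    // pull the body in small increments instead of one big in-memory buffer.
    let mut body = Cursor::new(preamble.into_bytes())
        .chain(file)
        .chain(Cursor::new(epilogue.into_bytes()));

    let mut easy = Easy::new();
    easy.url(url)?;
    easy.post(true)?;
    easy.post_field_size(total_len)?;

    let mut headers = List::new();
    headers.append(&format!("Content-Type: multipart/form-data; boundary={boundary}"))?;
    easy.http_headers(headers)?;

    let mut transfer = easy.transfer();
    transfer.read_function(move |into| {
        // Fill curl's buffer from the chained reader; returning 0 ends the body.
        body.read(into).map_err(|_| ReadError::Abort)
    })?;
    transfer.perform()?;
    Ok(())
}

Because the body length is known up front (framing plus file size), the sketch can set post_field_size and avoid chunked transfer encoding.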

We can still consider changing to reqwest, but this would now be a completely separate task.

szokeasaurusrex · Jan 15 '25

Is there any way to log the memory usage of uploading? I've spent a ridiculous amount of time trying to figure out why some chunks of my debug files are not being uploaded, and --log-level trace has not been helpful in diagnosing the issue. (It simply shows that there are 5 chunks to send and then, at the end, an error because 2 chunks are missing.) It would be useful to see logging related to memory usage. Is that something that could be added sooner rather than waiting for this migration from curl to reqwest?

xinsight · Jan 16 '25

The memory usage doesn't seem to be related to the chunk uploading; as @asottile-sentry guessed, it's just how sentry-cli loads the entire debug files into memory before processing.

To confirm this, set up some dummy data files:

dd if=/dev/zero of=.dist/10mb bs=1024 count=10240
dd if=/dev/zero of=.dist/100mb bs=1024 count=102400
dd if=/dev/zero of=.dist/200mb bs=1024 count=204800

Now run debug-files check with memory profiling:

/usr/bin/time -l sentry-cli debug-files check .dist/10mb 2>&1 | grep max
            17907712  maximum resident set size
/usr/bin/time -l sentry-cli debug-files check .dist/100mb 2>&1 | grep max
           112754688  maximum resident set size
/usr/bin/time -l sentry-cli debug-files check .dist/200mb 2>&1 | grep max
           217710592  maximum resident set size

As you can see, the memory usage grows with the file size. On my MacBook, processes have no memory limit, and sentry-cli has no problem uploading chunks. My guess is that there is a memory limit when running from a GitHub Action, so the limit is reached and this is not being caught or logged by sentry-cli (oh the irony...).

xinsight · Jan 16 '25

@xinsight thank you for writing in.

Regarding your first comment, we do not plan to add any logging of memory usage, since there are external tools you can use to monitor it.

Regarding the second comment, I see you are using the debug-files check command, not the debug-files upload command, which is going to use a completely different code path from what we use for uploading. Nevertheless, I did some investigating here, and it appears that debug-files check memory-maps the debug files. My understanding is that the "maximum resident set size" output by the time command can include any portion of the memory-mapped files that get loaded into memory. However, this memory should not count towards any memory limit, because it is simply a cache of the file still on disk, and because the amount of the file loaded into memory is OS and system dependent (memory-mapping is an OS abstraction).

I confirmed with another memory profiler (cargo instruments) that sentry-cli debug-files check does not allocate any memory for the file.
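
For illustration, this is roughly what memory-mapping a file looks like (using the memmap2 crate here as an assumption; sentry-cli's actual implementation may differ). The mapped bytes are backed by the page cache, so they can show up in the resident set size without being heap allocations made by the process:

use std::fs::File;

use memmap2::Mmap;

fn open_debug_file(path: &str) -> std::io::Result<Mmap> {
    let file = File::open(path)?;
    // Safety: the file must not be truncated or modified while it is mapped.
    let mmap = unsafe { Mmap::map(&file)? };
    Ok(mmap)
}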

szokeasaurusrex · Jan 16 '25

That said, @xinsight, if you have been experiencing an error uploading debug files that you have had trouble debugging, please feel free to open a bug report issue. If possible, please provide a reproduction when opening the new issue.

szokeasaurusrex · Jan 16 '25