[CPP] Memory Usage Analysis of TsFile CPP
Recently I analyzed the memory usage of the C++ library when inserting data with a Tablet. Based on the current implementation, there is still some work to be done on memory management in the C++ code. For specific workloads and memory budgets, we need to provide a suitable and accurate way to calculate the expected memory usage.
EXPERIMENT
With the help of Valgrind (massif) and massif-visualizer, I got the results below.
BASIC INFO
50 devices * 50 measurements = 2500 time series. All data are stored as plain-encoded INT32 without compression.
1. 25 rows per tablet, 300 points per page, flushing only at the end of the program.
2. 10 rows per tablet, 100 points per page, flushing only at the end of the program.
3. 10 rows per tablet, 100 points per page, flushing after each tablet is written.
From the screenshot above:
- Because the program wraps allocations in its own memory-management layer, Valgrind can only record the underlying allocations and cannot tell where the memory actually goes.
- The memory usage of the configuration that alternates between writing and flushing should be smaller, but there may be memory that is not being released.
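For context, here is a rough back-of-the-envelope of how much raw data the configurations above buffer, assuming an 8-byte timestamp plus a 4-byte INT32 value per point (the same per-point cost used later in this analysis) and one Tablet per device:

```cpp
#include <cstdio>

int main() {
    const int devices = 50;
    const int measurements = 50;           // measurements per device
    const int point_bytes = 8 + 4;         // timestamp + INT32 value
    const int rows_per_tablet[] = {25, 10};

    for (int rows : rows_per_tablet) {
        double per_tablet = static_cast<double>(rows) * measurements * point_bytes;
        double per_round = per_tablet * devices;   // one tablet written per device
        std::printf("%2d rows: ~%.1f KB per tablet, ~%.0f KB across all 50 devices\n",
                    rows, per_tablet / 1024.0, per_round / 1024.0);
    }
    return 0;
}
```

Each round of 50 tablets is only a few hundred KB of raw data, so it takes many rounds before any size-based limit is reached.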
When flush is not called manually, memory usage is mainly governed by chunk_group_size_threshold_, which defaults to 128 MB. That is pretty close to the 144 MB peak observed above. Considering that other things also occupy memory (metadata, the program itself, and so on), chunk_group_size_threshold_ seems to work well enough.
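As a rough sanity check on that reading, the gap between the observed peak and the default threshold is what has to be explained by everything else (the 128 MB and 144 MB figures are the ones quoted above):

```cpp
#include <cstdio>

int main() {
    // Figures taken from the experiment above.
    const double threshold_mb = 128.0;      // default chunk_group_size_threshold_
    const double observed_peak_mb = 144.0;  // massif peak when flushing only at the end
    // Whatever is not buffered chunk data is attributed to metadata,
    // the schema, and the process itself.
    std::printf("overhead beyond the threshold: ~%.0f MB (%.0f%% of the peak)\n",
                observed_peak_mb - threshold_mb,
                100.0 * (observed_peak_mb - threshold_mb) / observed_peak_mb);
    return 0;
}
```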
When flushing after inserting each tablet, the remaining memory consists mainly of metadata. We can do a simple calculation: in the last experiment each tablet only had 10 rows, so one ChunkMetadata was generated for every 10 points of each time series. A ChunkMetadata is typically around 50-80 bytes, so each data point still costs 5-8 bytes of resident memory on average, even after it has been flushed.
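A quick worked version of that calculation, under the same assumptions (10 points per chunk, 50-80 bytes per ChunkMetadata):

```cpp
#include <cstdio>

int main() {
    // Assumptions: 10 rows per tablet means every flush cuts a chunk of
    // 10 points per time series, and one ChunkMetadata costs ~50-80 bytes.
    const int points_per_chunk = 10;
    const double meta_low = 50.0, meta_high = 80.0;   // bytes per ChunkMetadata

    std::printf("metadata overhead per point: %.0f-%.0f bytes\n",
                meta_low / points_per_chunk, meta_high / points_per_chunk);
    // With 2500 time series, every flushed round additionally keeps
    // 2500 ChunkMetadata objects (~125-200 KB) resident until the file closes.
    return 0;
}
```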
chunk_group_size_threshold_ sounds promising, but we cannot rely on it alone. The reason is that the memory check is performed only after each insertion, so if a single Tablet is larger than chunk_group_size_threshold_, the parameter effectively does not take effect. Either way, the size of a Tablet must be carefully controlled if we want to hold memory below a strict threshold, e.g. 10 MB. If we set chunk_group_size_threshold_ to 10 MB and insert two tablets of 9 MB each, the first insertion does not trigger a flush, so the second one pushes the peak to 18 MB: 80% more memory than we budgeted. As a result, we should still call flush after each insertion to enforce strict memory control.
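Here is a minimal sketch of why the check-after-insert ordering allows that overshoot; the accounting is deliberately simplified, only the order of operations matters:

```cpp
#include <cstdio>
#include <vector>

int main() {
    const double threshold_mb = 10.0;                        // chunk_group_size_threshold_
    const std::vector<double> tablet_sizes_mb = {9.0, 9.0};  // two large tablets

    double buffered_mb = 0.0, peak_mb = 0.0;
    for (double t : tablet_sizes_mb) {
        buffered_mb += t;                     // the tablet is inserted first ...
        if (buffered_mb > peak_mb) peak_mb = buffered_mb;
        if (buffered_mb >= threshold_mb)      // ... and the size check runs afterwards
            buffered_mb = 0.0;                // flush releases the buffered data
    }
    std::printf("peak buffered: %.0f MB against a %.0f MB budget (+%.0f%%)\n",
                peak_mb, threshold_mb, 100.0 * (peak_mb - threshold_mb) / threshold_mb);
    return 0;
}
```

The peak reaches 18 MB against the 10 MB budget, which is exactly the 80% overshoot described above.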
Additionally, we were estimating the proper size of a Tablet the wrong way. We previously divided the memory budget (10 MB) by the number of time series (2500) times the point size (8 + 4 bytes), but those 2500 time series come from 50 devices, a Tablet only covers one device, and we flushed once for each device. The result is that the row count a Tablet should have was significantly underestimated.
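A back-of-the-envelope for the corrected sizing, assuming the same 8 + 4 bytes per point and that a Tablet only ever holds the 50 measurements of a single device:

```cpp
#include <cstdio>

int main() {
    const double budget_bytes = 10.0 * 1024 * 1024;  // 10 MB budget per Tablet
    const int point_bytes = 8 + 4;                   // timestamp + INT32 value
    const int series_total = 2500;                   // all devices together
    const int series_per_device = 50;                // what one Tablet actually holds

    // Old (wrong) estimate: spread the budget across all 2500 series.
    const int rows_old = static_cast<int>(budget_bytes / (series_total * point_bytes));
    // Corrected estimate: a Tablet covers a single device, i.e. only 50 series.
    const int rows_new = static_cast<int>(budget_bytes / (series_per_device * point_bytes));

    std::printf("rows per tablet: old estimate %d, corrected %d\n", rows_old, rows_new);
    return 0;
}
```

Under these assumptions a Tablet can hold roughly 17,000 rows within the same 10 MB budget, rather than the ~350 rows the old estimate suggested.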
In conclusion, for the next experiments:
1. The flush-after-each-tablet policy should be kept.
2. The row count of each Tablet should be recalculated; it will be much higher than the current value.
3. Even after 2 is done, metadata will still accumulate in memory, only at a much slower rate. If it still has a major impact on memory, we should either switch to the next file after a certain number of flushes or implement the DiskTSMIterator that the Java edition provides.
Finally, some memory leak fixes are in this PR: https://github.com/apache/tsfile/pull/214