questdb icon indicating copy to clipboard operation
questdb copied to clipboard

Memory usage is too high and not released in Java (embedded) mode.

Open xtadg opened this issue 1 year ago • 3 comments

To reproduce

  1. Start the application I built with Spring Boot 3, with JVM parameters set to -Xmx2G -Xms2G. image

  2. I will create globally CairoConfiguration,CairoEngine, and CairoEngine during the project startup. image

  3. There is a table with 20 million records, and I will perform several group aggregation statistics on this table based on different conditions.

  4. It will be observed that the JVM garbage collection is normal, but the RES usage shown by the top command is very high and does not get released even after garbage collection. image image

  5. I tried using NMT to check the memory situation, but did not find any issues. native_memory.txt

  6. I suspect that there might be some places in QuestDB where memory is not being released. Please help me.

  7. Because it is deployed alongside other programs, the memory usage cannot be so high, even if it means sacrificing some performance.

QuestDB version:

8.1.0

OS, in case of Docker specify Docker and the Host OS:

CentOS 8.2.2004

File System, in case of Docker specify Host File System:

ext4

Full Name:

xtadg

Affiliation:

Sugon

Have you followed Linux, MacOs kernel configuration steps to increase Maximum open files and Maximum virtual memory areas limit?

  • [X] Yes, I have

Additional context

No response

xtadg avatar Aug 19 '24 07:08 xtadg

I use pmap -x pid | sort -n -k3 , and found many files are occupying out-of-heap memory, causing high RES usage. Why don’t these files release after use? Is there a configuration that allows QuestDB to release it in a timely manner?" image

xtadg avatar Aug 23 '24 03:08 xtadg

What is your schema, and what queries are you trying to run?

nwoolmer avatar Aug 27 '24 16:08 nwoolmer

What is your schema, and what queries are you trying to run?

@nwoolmer Thank you for your reply,here is my schema

CREATE TABLE t_traffic_log (
      id long,
      src_zone symbol,
      dst_zone symbol,
      src_region symbol,
      dst_region symbol,
      src_ip long256,
      dst_ip long256,
      src_port int,
      dst_port int,
      proto symbol,
      app_name symbol,
      user_name symbol,
      sec_policy symbol,
      bandwidth_policy symbol,
      uplink_bytes long,
      uplink_pkts long,
      downlink_bytes long,
      downlink_pkts long,
      total_bytes long,
      total_pkts long,
      interface_in symbol,
      interface_out symbol,
      session_closed_reason symbol,
      vsys_name symbol index,
      timestamp timestamp
  ) TIMESTAMP(timestamp) PARTITION BY DAY BYPASS WAL

and here is my queries

# The variable x switches back and forth among the fields: src_zone, dst_zone, src_region, dst_region, src_ip, dst_ip, src_port, dst_port, proto, app_name, user_name, sec_policy, and bandwidth_policy, executing several rounds.
SELECT x,
       uplinkBytes / (select SUM(uplink_bytes) from t_traffic_log),
       downlinkBytes / (select SUM(downlink_bytes) from t_traffic_log),
       totalBytes / (select SUM(total_bytes) from t_traffic_log),
       sessionNum / (select COUNT() from t_traffic_log)
FROM (SELECT x,
             SUM(uplink_bytes)   AS uplinkBytes,
             SUM(downlink_bytes) AS downlinkBytes,
             SUM(total_bytes)    AS totalBytes,
             COUNT()             AS sessionNum
      FROM t_traffic_log
      GROUP BY src_addr)

xtadg avatar Aug 28 '24 13:08 xtadg

@nwoolmer Can you help me? As long as the memory is limited, I can even accept a decrease in query performance.

xtadg avatar Nov 06 '24 02:11 xtadg

@xtadg you need to call CairoEngine#releaseInactive() periodically to release inactive table readers and writers. Otherwise, they might keep mmapped and anonymous memory around. It's a good idea to take a look at how we use APIs in ServerMain which is the class responsible for running the database in standalone mode.

Also, if you need to check native memory usage, you can call dump_memory_usage(); query which will print grouped memory usage stats to the log. Alternatively, you can access this data programatically:

for (int i = MemoryTag.MMAP_DEFAULT; i < MemoryTag.SIZE; i++) {
    System.out.println("tag: " + MemoryTag.nameOf(i) + ", used bytes:" + Unsafe.getMemUsedByTag(i));
}

puzpuzpuz avatar Jan 08 '25 13:01 puzpuzpuz

@puzpuzpuz Thanks. I've tried adding CairoEngine#releaseInactive() at the end of my query function, but it still doesn't work. I used the code you provided to print out the RSS memory usage, image but the result is quite different from what I saw with the top command. image The following is the core code for the query I executed. image I really don't know where the problem lies.

xtadg avatar Jan 09 '25 08:01 xtadg

I've tried adding CairoEngine#releaseInactive() at the end of my query function, but it still doesn't work.

Table readers are only released after a timeout defined by CairoConfiguration#getInactiveReaderTTL(), so calling releaseInactive() immediately may not close readers (at least with PropServerConfiguration). You can try calling releaseAllReaders() instead of releaseInactive() to see if this changes anything (keep in mind that releaseAllReaders() is a test-only API, so it shouldn't be called in prod code).

As for top's RSS, as as as I recall it includes certain part of mmapped memory in that value (in-use, active pages), so you should also be checking MMAP_* memory stats.

puzpuzpuz avatar Jan 09 '25 12:01 puzpuzpuz

@puzpuzpuz Thank you very much. It's just as I guessed. I think that the readers would not be released immediately, so I added a scheduled task to call releaseInactive. After observation, the RSS occupation can be released successfully. Finally, this problem has been solved. Thanks.

xtadg avatar Jan 10 '25 01:01 xtadg