Possible memory leak

Open weetyre opened this issue 2 years ago • 8 comments

I deployed Ignite on k8s and allocated 8G of memory to the pod, but memory usage climbs to 90%. I see two possibilities: either the workload genuinely needs more direct memory, or there is a memory leak. If it is a leak, is it a known issue?

weetyre avatar Feb 20 '24 01:02 weetyre

the error msg is here:

2024-02-03T07:00:43,787Z+0000|ERROR|checkpoint-runner-cpu-#91|Log4J2Logger#error:533|Critical system error detected. Will be handled accordingly to configured handler [hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0, super=AbstractFailureHandler [ignoredFailureTypes=UnmodifiableSet [SYSTEM_WORKER_BLOCKED, SYSTEM_CRITICAL_OPERATION_TIMEOUT]]], failureCtx=FailureContext [type=CRITICAL_ERROR, err=class o.a.i.i.processors.cache.persistence.StorageException: Unable to write]]
org.apache.ignite.internal.processors.cache.persistence.StorageException: Unable to write
    at org.apache.ignite.internal.processors.cache.persistence.wal.filehandle.FsyncFileWriteHandle.flush(FsyncFileWriteHandle.java:475) ~[ignite-core-2.16.0.jar:2.16.0]
    at org.apache.ignite.internal.processors.cache.persistence.wal.filehandle.FsyncFileWriteHandle.addRecord(FsyncFileWriteHandle.java:267) ~[ignite-core-2.16.0.jar:2.16.0]
    at org.apache.ignite.internal.processors.cache.persistence.wal.FileWriteAheadLogManager.log(FileWriteAheadLogManager.java:968) ~[ignite-core-2.16.0.jar:2.16.0]
    at org.apache.ignite.internal.processors.cache.persistence.wal.FileWriteAheadLogManager.log(FileWriteAheadLogManager.java:921) ~[ignite-core-2.16.0.jar:2.16.0]
    at org.apache.ignite.internal.processors.cache.persistence.pagemem.PageMemoryImpl.beforeReleaseWrite(PageMemoryImpl.java:1867) ~[ignite-core-2.16.0.jar:2.16.0]
    at org.apache.ignite.internal.processors.cache.persistence.pagemem.PageMemoryImpl.writeUnlockPage(PageMemoryImpl.java:1698) ~[ignite-core-2.16.0.jar:2.16.0]
    at org.apache.ignite.internal.processors.cache.persistence.pagemem.PageMemoryImpl.writeUnlock(PageMemoryImpl.java:523) ~[ignite-core-2.16.0.jar:2.16.0]
    at org.apache.ignite.internal.processors.cache.persistence.pagemem.PageMemoryImpl.writeUnlock(PageMemoryImpl.java:515) ~[ignite-core-2.16.0.jar:2.16.0]
    at org.apache.ignite.internal.processors.cache.persistence.tree.util.PageHandler.writeUnlock(PageHandler.java:416) ~[ignite-core-2.16.0.jar:2.16.0]
    at org.apache.ignite.internal.processors.cache.persistence.DataStructure.writeUnlock(DataStructure.java:276) ~[ignite-core-2.16.0.jar:2.16.0]
    at org.apache.ignite.internal.processors.cache.persistence.DataStructure.writeUnlock(DataStructure.java:247) ~[ignite-core-2.16.0.jar:2.16.0]
    at org.apache.ignite.internal.processors.cache.persistence.freelist.PagesList.put(PagesList.java:933) ~[ignite-core-2.16.0.jar:2.16.0]
    at org.apache.ignite.internal.processors.cache.persistence.freelist.PagesList$PutBucket.run(PagesList.java:199) ~[ignite-core-2.16.0.jar:2.16.0]
    at org.apache.ignite.internal.processors.cache.persistence.freelist.PagesList$PutBucket.run(PagesList.java:173) ~[ignite-core-2.16.0.jar:2.16.0]
    at org.apache.ignite.internal.processors.cache.persistence.tree.util.PageHandler.writePage(PageHandler.java:313) ~[ignite-core-2.16.0.jar:2.16.0]
    at org.apache.ignite.internal.processors.cache.persistence.DataStructure.write(DataStructure.java:304) ~[ignite-core-2.16.0.jar:2.16.0]
    at org.apache.ignite.internal.processors.cache.persistence.freelist.PagesList.flushBucketsCache(PagesList.java:415) ~[ignite-core-2.16.0.jar:2.16.0]
    at org.apache.ignite.internal.processors.cache.persistence.freelist.PagesList.saveMetadata(PagesList.java:356) ~[ignite-core-2.16.0.jar:2.16.0]
    at org.apache.ignite.internal.processors.cache.persistence.GridCacheOffheapManager.saveStoreMetadata(GridCacheOffheapManager.java:322) ~[ignite-core-2.16.0.jar:2.16.0]
    at org.apache.ignite.internal.processors.cache.persistence.GridCacheOffheapManager.lambda$syncMetadata$1(GridCacheOffheapManager.java:299) ~[ignite-core-2.16.0.jar:2.16.0]
    at org.apache.ignite.internal.util.IgniteUtils.lambda$wrapIgniteFuture$5(IgniteUtils.java:11930) ~[ignite-core-2.16.0.jar:2.16.0]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) ~[?:?]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) ~[?:?]
    at java.lang.Thread.run(Unknown Source) ~[?:?]
Caused by: java.io.IOException: java.lang.OutOfMemoryError: Cannot reserve 131072 bytes of direct buffer memory (allocated: 2411594182, limit: 2411724800)
    ... 24 more
Caused by: java.lang.OutOfMemoryError: Cannot reserve 131072 bytes of direct buffer memory (allocated: 2411594182, limit: 2411724800)
    at java.nio.Bits.reserveMemory(Unknown Source) ~[?:?]
    at java.nio.DirectByteBuffer.<init>(Unknown Source) ~[?:?]
    at java.nio.ByteBuffer.allocateDirect(Unknown Source) ~[?:?]
    at org.apache.ignite.internal.processors.cache.persistence.wal.filehandle.FsyncFileWriteHandle$1.initialValue(FsyncFileWriteHandle.java:133) ~[ignite-core-2.16.0.jar:2.16.0]

weetyre avatar Feb 20 '24 01:02 weetyre

k8s pod may have 8G memory, but the JVM is limited to 2.4G (limit: 2411724800).

Try adjusting Xmx/Xms JVM settings: https://ignite.apache.org/docs/latest/perf-and-troubleshooting/memory-tuning
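For reference, the three numbers in the OutOfMemoryError message can be checked with a little arithmetic. The sketch below is only an illustration: the class and method names are made up for this example, and the values are copied from the error message. It shows that the limit is exactly 2300 MiB and that the remaining headroom was slightly smaller than the 131072-byte request.

```java
// Illustrative arithmetic on the numbers quoted in the OutOfMemoryError message.
// Class and method names are hypothetical; nothing here is part of Ignite.
final class DirectLimitCheck {
    /** Bytes still available under the direct-buffer limit. */
    static long headroom(long allocated, long limit) {
        return limit - allocated;
    }

    /** True if a request of the given size still fits under the limit. */
    static boolean fits(long requested, long allocated, long limit) {
        return requested <= headroom(allocated, limit);
    }

    public static void main(String[] args) {
        long requested = 131_072L;        // "Cannot reserve 131072 bytes"
        long allocated = 2_411_594_182L;  // "allocated: 2411594182"
        long limit     = 2_411_724_800L;  // "limit: 2411724800"

        System.out.println("limit in MiB: " + limit / (1024 * 1024));            // 2300
        System.out.println("headroom:     " + headroom(allocated, limit));       // 130618
        System.out.println("fits:         " + fits(requested, allocated, limit)); // false
    }
}
```

So the direct-buffer pool was essentially full when the WAL write asked for one more 128 KiB buffer.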

ptupitsyn avatar Feb 20 '24 11:02 ptupitsyn

Hi ptupitsyn, our Ignite node runs in a k8s pod. The pod has 8G of memory and the underlying host has 64G. We use defaultDataRegionConfiguration with native persistence, but we haven't set maxSize for it. I looked at the source code:

/** Fraction of available memory to allocate for default DataRegion. */
private static final double DFLT_DATA_REGION_FRACTION = 0.2;

/** Default data region's size is 20% of physical memory available on current machine. */
public static final long DFLT_DATA_REGION_MAX_SIZE = Math.max(
    (long)(DFLT_DATA_REGION_FRACTION * U.getTotalMemoryAvailable()),
    DFLT_DATA_REGION_INITIAL_SIZE);

U.getTotalMemoryAvailable() calls sunOs.getTotalPhysicalMemorySize(), which returns the host's 64G of memory, so the default max size is 64G * 0.2 = 12.8G. The JVM is limited by -XX:MaxDirectMemorySize=2300m, but Ignite doesn't know that; it only knows the max size is 12.8G. So Ignite keeps requesting memory until an OutOfMemoryError occurs. Is that right?
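The arithmetic in this question can be written out as a small sketch. It assumes, as the post does, that the reported physical memory is the host's 64G rather than the pod's 8G. The class and method names are hypothetical, and the lower bound from the quoted source (DFLT_DATA_REGION_INITIAL_SIZE) is left out because its value is not quoted in this thread. It only reproduces the calculation and does not settle whether the data region is what triggers the error.

```java
// Reproduces the "20% of physical memory" estimate from the quoted Ignite source.
// Hypothetical helper for illustration only; it does not call any Ignite API.
final class DefaultRegionEstimate {
    /** Fraction taken from the quoted DFLT_DATA_REGION_FRACTION constant. */
    static final double DFLT_DATA_REGION_FRACTION = 0.2;

    /** Default region max size for a given amount of physical memory, in bytes. */
    static long defaultRegionMaxBytes(long totalPhysicalBytes) {
        return (long) (DFLT_DATA_REGION_FRACTION * totalPhysicalBytes);
    }

    public static void main(String[] args) {
        long gib = 1L << 30;
        long hostBytes = 64 * gib; // host memory, as stated in the post
        long podBytes  = 8 * gib;  // pod memory, as stated in the post

        long regionBytes = defaultRegionMaxBytes(hostBytes);
        System.out.printf("default region max: %.1f GiB%n", (double) regionBytes / gib);
        System.out.println("larger than the pod: " + (regionBytes > podBytes));
    }
}
```

Under these assumptions the default region limit comes out at about 12.8 GiB, which is larger than the whole pod, so setting an explicit max size for the data region looks safer than relying on the default.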

shuangquansq avatar Mar 06 '24 10:03 shuangquansq

It is not about data regions or persistence, the error indicates that you hit the JVM heap limit. Try changing Xmx to a higher value.

Also, 8G RAM is too small for an Ignite node in production. Please check the document I've linked above. A good starting point is -Xms10g -Xmx10g, and you need more RAM for data regions, I'd say at least 32G in total per pod.

ptupitsyn avatar Mar 06 '24 12:03 ptupitsyn

Hi ptupitsyn, the error log says: Caused by: java.io.IOException: java.lang.OutOfMemoryError: Cannot reserve 131072 bytes of direct buffer memory (allocated: 2411594182, limit: 2411724800). That is direct (off-heap) memory, not heap memory. Our JVM parameters: -XX:MaxRAMPercentage=35.0 -Xms1g -XX:MetaspaceSize=128m -XX:MaxMetaspaceSize=512m -XX:MaxDirectMemorySize=2300m -XX:+UseG1GC -XX:MaxGCPauseMillis=200 -XX:G1ReservePercent=15 -XX:InitiatingHeapOccupancyPercent=45 -XX:+ExplicitGCInvokesConcurrent -XX:+AlwaysPreTouch -XX:AutoBoxCacheMax=20000 -XX:+ScavengeBeforeFullGC -XX:+UseStringDeduplication -XX:MaxHeapFreeRatio=40

-XX:MaxDirectMemorySize=2300m equals 2411724800 bytes. Memory usage is usually over 90%: 8G * 0.9 = 7.2G.
JVM heap: 8G * 35% = 2.8G
JVM metaspace: 128M + 512M = 640M = 0.64G
JVM direct memory: 2.3G
Total: 5.74G
That leaves 7.2G - 5.74G = 1.46G, and we don't know where it is used. Maybe there is a memory leak. Thanks.
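The budget in this post can be reproduced as a short sketch, using the post's own decimal units (8G = 8000 MB) and its own figures. The class and method names are made up for illustration. Two caveats apply. The flags are upper limits, not measured usage. Also, -XX:MaxMetaspaceSize is already the metaspace cap, so adding -XX:MetaspaceSize on top probably overstates that line. Native areas such as thread stacks, the code cache, and GC structures are not covered by any of these three limits, so a gap between the listed limits and the observed usage does not by itself indicate a leak.

```java
// Reproduces the memory budget from the post in whole megabytes (decimal, 1G = 1000M).
// Hypothetical helper for illustration; the inputs are the figures quoted in the post.
final class PodMemoryBudget {
    /** Memory not explained by the listed JVM limits, in MB. */
    static long unaccountedMb(long usedMb, long heapMb, long metaspaceMb, long directMb) {
        return usedMb - (heapMb + metaspaceMb + directMb);
    }

    public static void main(String[] args) {
        long podMb       = 8_000;            // 8G pod
        long usedMb      = podMb * 90 / 100; // observed usage of about 90% -> 7200
        long heapMb      = podMb * 35 / 100; // -XX:MaxRAMPercentage=35.0   -> 2800
        long metaspaceMb = 128 + 512;        // the post's sum of both metaspace flags
        long directMb    = 2_300;            // -XX:MaxDirectMemorySize=2300m

        long listedMb = heapMb + metaspaceMb + directMb;
        System.out.println("listed limits: " + listedMb + " MB"); // 5740 MB
        System.out.println("unaccounted:   "
            + unaccountedMb(usedMb, heapMb, metaspaceMb, directMb) + " MB"); // 1460 MB
    }
}
```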

shuangquansq avatar Mar 07 '24 07:03 shuangquansq

There are no known memory leaks in Ignite. Please adjust JVM settings as explained above.

ptupitsyn avatar Mar 07 '24 07:03 ptupitsyn

the error msg is here: (the same stack trace as in the original report above, ending in) Caused by: java.lang.OutOfMemoryError: Cannot reserve 131072 bytes of direct buffer memory (allocated: 2411594182, limit: 2411724800)

I ran into a similar problem. I deployed a single-node Ignite on a Linux host and also got an out-of-memory error. My heap memory seems to go mostly unused; almost everything goes to off-heap memory, and according to the error it is clearly the off-heap memory that runs out. I am very confused.

So I feel that setting Xms very large is useless; what mainly matters is setting -XX:MaxDirectMemorySize large enough. -Xms2G -Xmx4G -XX:MaxDirectMemorySize=6G works much better.
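To see which limit a node is actually approaching, the standard JDK management beans can report the heap ceiling and the current direct-buffer usage. This is a minimal sketch with a hypothetical class name. It only covers buffers created through ByteBuffer.allocateDirect (the call at the bottom of the stack trace), not memory that a library obtains by other native means.

```java
import java.lang.management.BufferPoolMXBean;
import java.lang.management.ManagementFactory;

// Prints the heap ceiling and the current usage of the NIO "direct" buffer pool.
// Hypothetical diagnostic helper; it uses only standard JDK management APIs.
final class MemoryLimitsReport {
    /** Maximum heap the JVM will try to use, in bytes. */
    static long heapMaxBytes() {
        return Runtime.getRuntime().maxMemory();
    }

    /** Bytes currently used by direct buffers, or -1 if the pool is not reported. */
    static long directUsedBytes() {
        for (BufferPoolMXBean pool : ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class)) {
            if ("direct".equals(pool.getName())) {
                return pool.getMemoryUsed();
            }
        }
        return -1L;
    }

    public static void main(String[] args) {
        long mib = 1024L * 1024L;
        System.out.println("heap max:    " + heapMaxBytes() / mib + " MiB");
        System.out.println("direct used: " + directUsedBytes() / mib + " MiB");
    }
}
```

Running this inside the node's JVM (or reading the same beans over JMX) while the load is applied shows whether direct-buffer usage keeps climbing toward -XX:MaxDirectMemorySize or levels off.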

op-family avatar Mar 29 '24 02:03 op-family