
[QAT_MEM] page allocation failure issues

Open duadbsgh opened this issue 7 years ago • 5 comments

Hi!

A number of kernel panic messages were found in syslog while using the QAT module (they occur in qat_mem.ko). It looks like qat_mem_ioctl failed to allocate pages.

From the panic messages it looks like the system hangs because of a memory leak. Do you have any reports of this issue? If it has been reported before, is there a known fix?

```
[7730502.783337] page allocation failure: order:5, mode:0x24000c0
[7730502.783343] CPU: 18 PID: 22411 Comm: ssl_proxy Tainted: G OE 4.4.0-116-generic #140-Ubuntu
[7730502.783345] Hardware name: Micro-Star International Co., Ltd. KT-S145/KT-S145, BIOS 5.11 08/25/2017
[7730502.783348] 0000000000000286 7aa48184a81eb87b ffff880643953be8 ffffffff813ffc13
[7730502.783352] 00000000024000c0 0000000000000000 ffff880643953c78 ffffffff81198bca
[7730502.783354] 7aa48184a81eb87b 0000000000000005 0000000000000040 ffff88086a521c00
[7730502.783357] Call Trace:
[7730502.783367] [] dump_stack+0x63/0x90
[7730502.783374] [] warn_alloc_failed+0xfa/0x150
[7730502.783378] [] ? __alloc_pages_direct_compact+0x56/0x130
[7730502.783381] [] __alloc_pages_slowpath.constprop.88+0x48d/0xb00
[7730502.783384] [] __alloc_pages_nodemask+0x288/0x2a0
[7730502.783388] [] alloc_pages_current+0x8c/0x110
[7730502.783390] [] __get_free_pages+0xe/0x40
[7730502.783396] [] qat_mem_ioctl+0x98/0x2f0 [qat_mem]
[7730502.783401] [] do_vfs_ioctl+0x2af/0x4b0
[7730502.783407] [] ? vfs_write+0x149/0x1a0
[7730502.783409] [] SyS_ioctl+0x79/0x90
[7730502.783416] [] entry_SYSCALL_64_fastpath+0x1c/0xbb
[7730502.783418] Mem-Info:
[7730502.783424] active_anon:1253040 inactive_anon:157184 isolated_anon:0
[7730502.783424] active_file:2610404 inactive_file:2484265 isolated_file:0
[7730502.783424] unevictable:914 dirty:1926 writeback:0 unstable:0
[7730502.783424] slab_reclaimable:246302 slab_unreclaimable:36859
[7730502.783424] mapped:802283 shmem:825087 pagetables:31498 bounce:0
[7730502.783424] free:96382 free_pcp:0 free_cma:0
[7730502.783429] Node 0 DMA free:15884kB min:40kB low:48kB high:60kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15968kB managed:15884kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
[7730502.783435] lowmem_reserve[]: 0 1808 15886 15886 15886
[7730502.783439] Node 0 DMA32 free:65040kB min:5084kB low:6352kB high:7624kB active_anon:31836kB inactive_anon:10556kB active_file:19616kB inactive_file:12368kB unevictable:80kB isolated(anon):0kB isolated(file):0kB present:1971340kB managed:1890576kB mlocked:80kB dirty:20kB writeback:0kB mapped:2300kB shmem:5524kB slab_reclaimable:118696kB slab_unreclaimable:7020kB kernel_stack:400kB pagetables:6560kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:128 all_unreclaimable? no
[7730502.783446] lowmem_reserve[]: 0 0 14078 14078 14078
[7730502.783449] Node 0 Normal free:67724kB min:39608kB low:49508kB high:59412kB active_anon:3389268kB inactive_anon:404044kB active_file:4766456kB inactive_file:4591508kB unevictable:1692kB isolated(anon):0kB isolated(file):0kB present:14680064kB managed:14416236kB mlocked:1692kB dirty:152kB writeback:0kB mapped:3052676kB shmem:3045544kB slab_reclaimable:397164kB slab_unreclaimable:97956kB kernel_stack:3120kB pagetables:89612kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
[7730502.783455] lowmem_reserve[]: 0 0 0 0 0
[7730502.783458] Node 1 Normal free:236880kB min:45368kB low:56708kB high:68052kB active_anon:1591056kB inactive_anon:214136kB active_file:5655544kB inactive_file:5333184kB unevictable:1884kB isolated(anon):0kB isolated(file):0kB present:16777216kB managed:16513148kB mlocked:1884kB dirty:7532kB writeback:0kB mapped:154156kB shmem:249280kB slab_reclaimable:469348kB slab_unreclaimable:42460kB kernel_stack:3008kB pagetables:29820kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:168 all_unreclaimable? no
[7730502.783463] lowmem_reserve[]: 0 0 0 0 0
[7730502.783466] Node 0 DMA: 1*4kB (U) 1*8kB (U) 0*16kB 0*32kB 2*64kB (U) 1*128kB (U) 1*256kB (U) 0*512kB 1*1024kB (U) 1*2048kB (M) 3*4096kB (M) = 15884kB
[7730502.783478] Node 0 DMA32: 3129*4kB (UME) 2743*8kB (UME) 1802*16kB (UM) 8*32kB (H) 6*64kB (H) 3*128kB (H) 1*256kB (H) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 64572kB
[7730502.783488] Node 0 Normal: 17432*4kB (UME) 15*8kB (UE) 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 69848kB
[7730502.783496] Node 1 Normal: 17549*4kB (UME) 9987*8kB (UME) 4053*16kB (UME) 238*32kB (UM) 245*64kB (UM) 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 238236kB
[7730502.783506] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7730502.783508] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7730502.783510] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7730502.783511] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7730502.783512] 5923366 total pagecache pages
[7730502.783514] 2924 pages in swap cache
[7730502.783516] Swap cache stats: add 3053297, delete 3050373, find 189484473/190114645
[7730502.783517] Free swap = 15232380kB
[7730502.783518] Total swap = 15625212kB
[7730502.783519] 8361147 pages RAM
[7730502.783521] 0 pages HighMem/MovableOnly
[7730502.783522] 152186 pages reserved
[7730502.783522] 0 pages cma reserved
[7730502.783523] 0 pages hwpoisoned
[7730502.783525] do_ioctl: __get_free_pages() failed
```

Kind Regards,

duadbsgh avatar Aug 21 '18 10:08 duadbsgh

Hi @duadbsgh,

The issue you are seeing is not uncommon. It is one of the reasons why the qat_mem/qat_contig_mem driver is not considered production ready.

The qat_mem/qat_contig_mem driver requests kernel memory via the __get_free_pages() call, asking for 128KB contiguous slabs. Because this memory comes out of the general system allocation, it works well to start with, but over time the system's memory can become fragmented by other usage, eventually leading to failed requests. It is not that memory is leaking; it is that the number of available 128KB contiguous slabs shrinks over time until none are left.
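
To make the failure mode concrete: 128KB with 4KB pages is an order-5 request, which matches the "order:5" in your log. Below is a minimal kernel-side sketch of that kind of allocation; the function name and structure are illustrative assumptions, not the actual qat_mem source.

```c
/*
 * Illustrative sketch only: a 128KB physically contiguous allocation as
 * performed somewhere in qat_mem's ioctl path. The function name and
 * error handling here are assumptions, not the real qat_mem code.
 */
#include <linux/gfp.h>

#define SLAB_ORDER 5   /* 2^5 pages * 4KB = 128KB, matching "order:5" in the log */

static void *alloc_contig_slab(void)
{
    /*
     * __get_free_pages() must find 2^order physically contiguous free
     * pages. A fragmented system can have plenty of free memory yet no
     * free block of this order, in which case this returns 0 and the
     * driver logs "do_ioctl: __get_free_pages() failed".
     */
    unsigned long addr = __get_free_pages(GFP_KERNEL, SLAB_ORDER);

    if (!addr)
        return NULL;
    return (void *)addr;
}
```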

If possible it is recommended to use the USDM driver that is included with the CPM 1.7 upstream driver. The USDM driver has far more configuration options to help prevent this.

Firstly, the USDM driver by default allocates 2MB slabs rather than 128KB slabs. This may seem counter-intuitive, but it uses them more efficiently, so overall memory consumption is reduced as long as it can get 2MB allocations. If 2MB slabs are not working well, you can configure the USDM driver to use 128KB slabs instead, like the qat_mem/qat_contig_mem driver uses.

Additionally, you can configure how the USDM driver caches slabs. In this way you can control whether slabs get released back to the kernel to be used by other applications. If you set the slab-caching threshold high, memory usage will remain high when traffic is low, but the risk of fragmentation decreases because nothing else can use that memory.

Finally, the USDM driver supports using Hugepages. You can configure the Linux kernel to set aside an amount of memory at boot time that will only be used by applications making use of Hugepages. In this way, if only USDM is using Hugepages, and Hugepages are configured as 2MB, there will be no fragmentation, because USDM (without reconfiguration) always requests and releases 2MB slabs. Obviously there is a trade-off: the memory you set aside for Hugepages cannot be used for anything else. There is also the issue that if you reserve too few Hugepages you can still exhaust the slabs if utilisation rises too high.
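
To illustrate why Hugepages sidestep the fragmentation problem: each 2MB huge page comes from a pool reserved up front and is physically contiguous by construction, so an allocator backed by huge pages never has to hunt for contiguous blocks in the general 4KB pool. Here is a minimal user-space sketch of the underlying mechanism; it only demonstrates MAP_HUGETLB and is not the USDM implementation or its configuration.

```c
/*
 * Minimal sketch of the huge page mechanism: map one 2MB huge page from
 * the pool reserved by the administrator (e.g. via vm.nr_hugepages).
 * This demonstrates MAP_HUGETLB only; it is not how USDM allocates or
 * is configured.
 */
#define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>

#define HUGEPAGE_SIZE (2UL * 1024 * 1024)

int main(void)
{
    /*
     * Memory backed by MAP_HUGETLB is physically contiguous per 2MB
     * page and comes out of the dedicated huge page pool, so ordinary
     * 4KB allocations cannot fragment it.
     */
    void *buf = mmap(NULL, HUGEPAGE_SIZE, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB, -1, 0);
    if (buf == MAP_FAILED) {
        perror("mmap(MAP_HUGETLB)");   /* fails if no huge pages are reserved */
        return EXIT_FAILURE;
    }

    /* ... use buf as one contiguous 2MB slab ... */

    munmap(buf, HUGEPAGE_SIZE);
    return EXIT_SUCCESS;
}
```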

Unfortunately I do not have any documentation on exactly how to configure these options for the USDM driver. I'm not sure whether it is documented as part of the QAT 1.7 driver documentation or whether you will need to look at the source code. If you are interested I can look into the details further.

Hope this additional information is helpful,

Steve.

stevelinsell avatar Aug 21 '18 11:08 stevelinsell

For info on using USDM with Hugepages, see section 3.16, "Huge Pages with the Included Memory Driver", of the QAT 1.7 Programmer's Guide.

Kind Regards,

Steve.

stevelinsell avatar Aug 21 '18 15:08 stevelinsell

Hi @stevelinsell, thank you for the answer.

We are currently unable to use USDM. Is there any way to resolve this in qat_mem? Sorry.

Kind Regards,

duadbsgh avatar Aug 22 '18 09:08 duadbsgh

Hi @duadbsgh,

I don't have a 100% solution for you unfortunately.

One thing you could try is in qae_mem_utils.c (one of the QAT Engine files), where there is a define, #define MAX_EMPTY_SLAB 128, that controls how many empty slabs are cached before they are released back to the kernel. Increasing this value will increase the memory usage of the QAT Engine when it is not busy, but it also means slabs are held onto rather than released, preventing that memory from becoming fragmented by other use. Setting it very high will prevent any slabs being released until the application exits.

This helps in a scenario where traffic bursts between high and low over time: if the high point happens early on, while slabs are still available, they will be kept. It will not help in a scenario where traffic stays low for a long initial period, memory gets fragmented, and then a burst happens, as slabs will still not be available. Any time a new traffic high point is reached, new slabs will be requested, and the error could still occur.
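
As a concrete example, the change is a one-line edit in qae_mem_utils.c; the value 1024 below is purely illustrative and should be sized to the number of slabs you expect at peak traffic:

```c
/* qae_mem_utils.c: number of empty slabs the QAT Engine caches before
 * releasing them back to the kernel. The default is 128; 1024 here is
 * an illustrative value only. */
#define MAX_EMPTY_SLAB 1024
```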

Additionally, you could make some code changes to qae_mem_utils.c to allocate a large number of slabs up front at application start-up and, by changing the #define mentioned above, prevent those slabs from being released. This would help in both scenarios above, but at a permanently higher memory cost.
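
A rough sketch of that idea is below. The helper names alloc_pinned() and free_pinned() are hypothetical stand-ins for the engine's internal pinned-memory allocation/free routines; map them onto the real functions in your copy of qae_mem_utils.c, and treat the sizes as illustrative.

```c
/*
 * Sketch of a start-up warm-up for qae_mem_utils.c. alloc_pinned() and
 * free_pinned() are hypothetical stand-ins for the engine's internal
 * slab allocation/free routines; substitute the real functions from
 * your copy of the source.
 */
#include <stddef.h>

extern void *alloc_pinned(size_t size);   /* hypothetical stand-in */
extern void  free_pinned(void *ptr);      /* hypothetical stand-in */

#define WARMUP_SLABS  512          /* illustrative: size to the expected peak */
#define WARMUP_SIZE   (64 * 1024)  /* illustrative request size per slab */

static void prefill_slab_cache(void)
{
    void *bufs[WARMUP_SLABS];
    int i;

    /* Grab slabs while large contiguous blocks are still available,
     * i.e. at application start-up, before memory fragments. */
    for (i = 0; i < WARMUP_SLABS; i++)
        bufs[i] = alloc_pinned(WARMUP_SIZE);

    /* Free the buffers: with MAX_EMPTY_SLAB raised high enough, the
     * now-empty slabs stay cached by the engine instead of being
     * returned to the kernel, so later traffic bursts reuse them. */
    for (i = 0; i < WARMUP_SLABS; i++)
        if (bufs[i])
            free_pinned(bufs[i]);
}
```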

Also consider the scenario where you want to restart the application without restarting the whole system. In that case the memory could already be fragmented, and none of the workarounds suggested above will be of any benefit. It is in those kinds of scenarios that hugepages are really useful. Unfortunately I do not have a version of qat_mem/qat_contig_mem that makes use of hugepages, but if you really need it, it may be possible for you to port the hugepage support from the USDM driver.

I hope some of that might be useful to you,

Steve.

stevelinsell avatar Aug 22 '18 22:08 stevelinsell

@stevelinsell is there a reason that physically contiguous memory is required? Is the QAT accelerator not behind an IOMMU?

DemiMarie avatar Mar 01 '23 19:03 DemiMarie