Memory leaks in Converse's cldb
Original issue: https://charm.cs.illinois.edu/redmine/issues/1202
Running valgrind on netlrts-linux-x86_64 examples reveals two memory leaks in initialization routines. One is in src/conv-ldb/cldb.c line 360, the other is in src/conv-core/cputopology.C line 183.
==13005== 32 bytes in 1 blocks are possibly lost in loss record 60 of 215
==13005== at 0x4C27AAA: malloc (vg_replace_malloc.c:291)
==13005== by 0x53FC64: CmiAlloc (convcore.c:3035)
==13005== by 0x545102: CldModuleGeneralInit (cldb.c:360)
==13005== by 0x5413A0: ConverseCommonInit (convcore.c:3791)
==13005== by 0x53D822: ConverseInit (machine-common-core.c:1261)
==13005== by 0x4914C6: main (main.C:18)
==13005==
==13005== 32 bytes in 1 blocks are possibly lost in loss record 61 of 215
==13005== at 0x4C28222: operator new[](unsigned long) (vg_replace_malloc.c:384)
==13005== by 0x54FB7F: cpuTopoRecvHandler(void*) (cputopology.C:183)
==13005== by 0x53F3BC: CsdSchedulePoll (convcore.c:1783)
==13005== by 0x54E3E4: LrtsInitCpuTopo (cputopology.C:582)
==13005== by 0x4945E2: _initCharm(int, char**) (init.C:1393)
==13005== by 0x53D92D: ConverseInit (machine-common-core.c:1294)
==13005== by 0x4914C6: main (main.C:18)
Original date: 2017-02-06 19:02:11
Fix for cputopology mem leak: ~~https://charm.cs.illinois.edu/gerrit/#/c/2202/~~ https://github.com/UIUC-PPL/charm/commit/3a01bf56a793fd10a9bed631dfae7ad24b731c50
The mem leak in cldb is just memory that is allocated at init and never freed; it should be explicitly released at exit. Doing this would require adding an explicit cleanup or finalize function that Converse calls at exit.
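A minimal sketch of what such an init/finalize pair could look like, in plain C. This is not the actual Converse code: the type `cld_state_sketch` and every `*_sketch` function are hypothetical names made up for illustration, and `malloc`/`free` stand in for whatever allocator the real init routine uses. The only point is that the finalize function releases exactly what init allocated and is safe to call more than once.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the state allocated once at init time. */
typedef struct {
    int load;
} cld_state_sketch;

/* Single module-level pointer; NULL means "not initialized". */
static cld_state_sketch *cld_state = NULL;

/* Allocate the state once. Returns 0 on success, -1 on allocation failure.
 * Calling it again while already initialized is a no-op. */
int cld_module_init_sketch(void) {
    if (cld_state != NULL) return 0;
    cld_state = malloc(sizeof *cld_state);
    if (cld_state == NULL) return -1;
    cld_state->load = 0;
    return 0;
}

/* Reports whether the state is currently allocated. */
int cld_module_is_initialized_sketch(void) {
    return cld_state != NULL;
}

/* Release what init allocated. free(NULL) is a no-op, and the pointer is
 * reset, so a second call is harmless. */
void cld_module_finalize_sketch(void) {
    free(cld_state);
    cld_state = NULL;
}

/* Intended call order: init at startup, finalize at exit. */
int cld_lifecycle_demo_sketch(void) {
    if (cld_module_init_sketch() != 0) return -1;
    assert(cld_module_is_initialized_sketch());
    cld_module_finalize_sketch();
    assert(!cld_module_is_initialized_sketch());
    return 0;
}
```

In the real runtime the finalize call would have to be hooked into whatever exit path Converse already runs on each PE; where exactly that hook belongs is left open here.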
Original date: 2017-06-27 02:48:52
I see memory leaks from the new topology code merged in the last couple of weeks. I'm not sure whether they are really new or whether they just have new names...
Original date: 2017-06-28 21:03:49
Here's the new output:
==25396== Syscall param socketcall.sendto(msg) points to uninitialised byte(s)
==25396== at 0x5756183: __sendto_nocancel (syscall-template.S:81)
==25396== by 0x608A2A: TransmitImplicitDgram1 (machine-eth.c:200)
==25396== by 0x608DA0: TransmitDatagram (machine-eth.c:285)
==25396== by 0x609CC8: CommunicationServerNet (machine-eth.c:734)
==25396== by 0x60A110: LrtsAdvanceCommunication (machine.c:1707)
==25396== by 0x6055A2: AdvanceCommunication (machine-common-core.c:1317)
==25396== by 0x605820: CmiGetNonLocal (machine-common-core.c:1487)
==25396== by 0x60C4E7: CsdNextMessage (convcore.c:1781)
==25396== by 0x60C835: CsdSchedulePoll (convcore.c:1972)
==25396== by 0x622743: LrtsInitCpuTopo (cputopology.C:593)
==25396== by 0x62291C: CmiInitCPUTopology (cputopology.C:679)
==25396== by 0x52AE4B: _initCharm(int, char**) (init.C:1364)
==25396== Address 0x5ae40c5 is 21 bytes inside a block of size 76 alloc'd
==25396== at 0x4C2AB80: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==25396== by 0x601DD1: malloc_nomigrate (libmemory-default.c:724)
==25396== by 0x60E96A: CmiAlloc (convcore.c:2939)
==25396== by 0x622694: LrtsInitCpuTopo (cputopology.C:580)
==25396== by 0x62291C: CmiInitCPUTopology (cputopology.C:679)
==25396== by 0x52AE4B: _initCharm(int, char**) (init.C:1364)
==25396== by 0x60555C: ConverseRunPE (machine-common-core.c:1296)
==25396== by 0x60547A: ConverseInit (machine-common-core.c:1198)
==25396== by 0x528B47: main (main.C:18)
==25395== Syscall param socketcall.sendto(msg) points to uninitialised byte(s)
==25395== at 0x5756183: __sendto_nocancel (syscall-template.S:81)
==25395== by 0x6088F6: TransmitImplicitDgram (machine-eth.c:174)
==25395== by 0x608C7F: TransmitDatagram (machine-eth.c:265)
==25395== by 0x609CC8: CommunicationServerNet (machine-eth.c:734)
==25395== by 0x60A110: LrtsAdvanceCommunication (machine.c:1707)
==25395== by 0x6055A2: AdvanceCommunication (machine-common-core.c:1317)
==25395== by 0x605820: CmiGetNonLocal (machine-common-core.c:1487)
==25395== by 0x60C4E7: CsdNextMessage (convcore.c:1781)
==25395== by 0x60C835: CsdSchedulePoll (convcore.c:1972)
==25395== by 0x622743: LrtsInitCpuTopo (cputopology.C:593)
==25395== by 0x62291C: CmiInitCPUTopology (cputopology.C:679)
==25395== by 0x52AE4B: _initCharm(int, char**) (init.C:1364)
==25395== Address 0x5af8d05 is 21 bytes inside a block of size 64 alloc'd
==25395== at 0x4C2AB80: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==25395== by 0x601DD1: malloc_nomigrate (libmemory-default.c:724)
==25395== by 0x60E96A: CmiAlloc (convcore.c:2939)
==25395== by 0x60593F: CopyMsg (machine-common-core.c:1579)
==25395== by 0x603A94: SendSpanningChildren (machine-broadcast.c:117)
==25395== by 0x603B12: SendSpanningChildrenProc (machine-broadcast.c:176)
==25395== by 0x603BAD: CmiSyncBroadcastFn1 (machine-broadcast.c:219)
==25395== by 0x603C87: CmiFreeBroadcastAllFn (machine-broadcast.c:290)
==25395== by 0x621EAF: cpuTopoHandler(void*) (cputopology.C:288)
==25395== by 0x60D6F9: CmiSendReduce (convcore.c:2446)
==25395== by 0x60E196: CmiHandleReductionMessage (convcore.c:2627)
==25395== by 0x60C3D9: CmiHandleMessage (convcore.c:1672)