node-lmdb icon indicating copy to clipboard operation
node-lmdb copied to clipboard

Double free issue when using a cursor

Open nwrbs opened this issue 7 years ago • 24 comments

I'm afraid this issue is probably scant on details, but here it goes. I'm getting an error in a program I wrote using node-lmdb. The error only occurs when I run the program, not when I debug the program, so I'm unable to tell where exactly in the program the error is occurring. The backtrace seems to indicate the issue is in CursorWrap...

I have 2 functions that I call 2 times each:

convertStringKeyTable(env, dbiA);
convertStringKeyTable(env, dbiB);
convertIntKeyTable(env, dbiC);
convertIntKeyTable(env, dbiD);

Both functions access the same env, but different dbi's. I can run each function individually, and I have no problem. The error only occurs when I call all four functions in a row. I can run any combination of the 4 (ie the first 3, the last 3, etc) and no error occurs. It is only when I call all four that the issue occurs.

The message from node is below. Is there anything more I can do to see where the error is coming from please let me know.

*** Error in `node': double free or corruption (top): 0x000000000279ff30 ***
======= Backtrace: =========
/lib64/libc.so.6(+0x81499)[0x7f686c880499]
../node_modules/node-lmdb/build/Release/node-lmdb.node(_ZN10CursorWrap5closeERKN3Nan20FunctionCallbackInfoIN2v85ValueEEE+0x64)[0x7f686a5edb24]
../node_modules/node-lmdb/build/Release/node-lmdb.node(+0x219c7)[0x7f686a5ed9c7]
node(_ZN2v88internal25FunctionCallbackArguments4CallEPFvRKNS_20FunctionCallbackInfoINS_5ValueEEEE+0x193)[0xa5ff63]
node[0xad710c]
node(_ZN2v88internal21Builtin_HandleApiCallEiPPNS0_6ObjectEPNS0_7IsolateE+0xaf)[0xad7d5f]
[0x22b24cb842fd]

nwrbs avatar Jul 06 '18 22:07 nwrbs

I think I have located the possible cause. Reuse of a cursor variable: The functions contain the following:

let cursor = new lmdb.Cursor(txn, tblDbi, { keyIsBuffer: true });
...
cursor.close();
...
cursor = new lmdb.Cursor(txnTmp, dbiTmpTbl, { keyIsBuffer: true });
...
cursor.close();

If I create a separate cursor variable ("cursor2) for the 2nd use of a cursor, the issue does not occur. It's only if I reuse the 1st declaration of "cursor"

nwrbs avatar Jul 06 '18 23:07 nwrbs

@nwbrad Thanks for reporting this. Can you please run node-lmdb inside gdb to see a better backtrace? Also, do you use the latest version?

If you are unfamiliar with gdb, this is how to launch it:

gdb --args node your-app.js

Inside gdb use r to run your app and when it crashes use bt to produce a backtrace. Please paste the backtrace here.

Venemo avatar Jul 06 '18 23:07 Venemo

I would also appreciate it if you could give me a test case that reproduces this.

Venemo avatar Jul 06 '18 23:07 Venemo

below is the backtrace.

*** Error in `/usr/bin/node': double free or corruption (top): 0x000000000226de70 ***
======= Backtrace: =========
/lib64/libc.so.6(+0x81499)[0x7ffff6c48499]
/test/node_modules/node-lmdb/build/Release/node-lmdb.node(_ZN10CursorWrap5closeERKN3Nan20FunctionCallbackInfoIN2v85ValueEEE+0x64)[0x7ffff49b5b24]
/test/node_modules/node-lmdb/build/Release/node-lmdb.node(+0x219c7)[0x7ffff49b59c7]
/usr/bin/node(_ZN2v88internal25FunctionCallbackArguments4CallEPFvRKNS_20FunctionCallbackInfoINS_5ValueEEEE+0x193)[0xa5ff63]
/usr/bin/node[0xad710c]
/usr/bin/node(_ZN2v88internal21Builtin_HandleApiCallEiPPNS0_6ObjectEPNS0_7IsolateE+0xaf)[0xad7d5f]
[0x133a2f9842fd]

nwrbs avatar Jul 06 '18 23:07 nwrbs

@nwbrad Can you please repeat that but using a debug build? So that we can see line numbers in there. The backtrace itself is enough, I can't use the memory map.

Venemo avatar Jul 07 '18 06:07 Venemo

I ran debuginfo-install nodejs-8.11.3-1nodesource.x86_64 to install debug source, but I got an error message: Could not find debuginfo for main pkg: 2:nodejs-8.11.3-1nodesource.x86_64 Everything thing else on the install ran fine. Any idea on where to get the debuginfo for nodejs-8.11.3-1nodesource.x86_64

nwrbs avatar Jul 07 '18 15:07 nwrbs

@nwbrad What I would like to see is line numbers in node-lmdb. For that you don't need the nodejs debuginfo, it is enough to make a debug build of node-lmdb.

Venemo avatar Jul 08 '18 10:07 Venemo

I don't think this worked like expected because I don't see any line numbers for node-lmdb. I installed and built nodejs from source and then ran the following:

gdb --args node_g taxonomy_convert.js
#0  0x00007ffff6bfd277 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
#1  0x00007ffff6bfe968 in __GI_abort () at abort.c:90
#2  0x00007ffff6c3fd37 in __libc_message (do_abort=do_abort@entry=2, fmt=fmt@entry=0x7ffff6d51d58 "*** Error in `%s': %s: 0x%s ***\n")
    at ../sysdeps/unix/sysv/linux/libc_fatal.c:196
#3  0x00007ffff6c48499 in malloc_printerr (ar_ptr=0x7ffff6f8d760 <main_arena>, ptr=<optimized out>, str=0x7ffff6d51e40 "double free or corruption (top)", action=3)
    at malloc.c:5025
#4  _int_free (av=0x7ffff6f8d760 <main_arena>, p=<optimized out>, have_lock=0) at malloc.c:3847
#5  0x00007ffff49b5b24 in CursorWrap::close(Nan::FunctionCallbackInfo<v8::Value> const&) ()
   from /node_modules/node-lmdb/build/Release/node-lmdb.node
#6  0x00007ffff49b59c7 in Nan::imp::FunctionCallbackWrapper () from /node_modules/node-lmdb/build/Release/node-lmdb.node
#7  0x00000000016bfc62 in v8::internal::FunctionCallbackArguments::Call (this=0x7fffffffbac0, 
    f=0x7ffff49b5950 <_ZN3Nan3impL23FunctionCallbackWrapperERKN2v820FunctionCallbackInfoINS1_5ValueEEE>) at ../deps/v8/src/api-arguments.cc:25
#8  0x0000000001775a93 in v8::internal::(anonymous namespace)::HandleApiCallHelper<false> (isolate=0x3818b70, function=..., new_target=..., fun_data=..., receiver=..., args=...)
    at ../deps/v8/src/builtins/builtins-api.cc:112
#9  0x0000000001773e21 in v8::internal::Builtin_Impl_HandleApiCall (args=..., isolate=0x3818b70) at ../deps/v8/src/builtins/builtins-api.cc:142
#10 0x0000000001773b93 in v8::internal::Builtin_HandleApiCall (args_length=4, args_object=0x7fffffffbec8, isolate=0x3818b70) at ../deps/v8/src/builtins/builtins-api.cc:130
#11 0x00002051c90043c4 in ?? ()
#12 0x0000006600000000 in ?? ()
#13 0x00002051c9004301 in ?? ()
#14 0x00007fffffffbe80 in ?? ()
#15 0x0000000000000006 in ?? ()
#16 0x00007fffffffbf70 in ?? ()
#17 0x00002051c90f153f in ?? ()
#18 0x0000074cf66022d1 in ?? ()
#19 0x000029731aa077e9 in ?? ()
#20 0x0000000400000000 in ?? ()
#21 0x0000226624a23661 in ?? ()
#22 0x0000074cf66022d1 in ?? ()
#23 0x0000074cf66022d1 in ?? ()
#24 0x0000074cf66022d1 in ?? ()
#25 0x0000074cf66022d1 in ?? ()
#26 0x0000226624a23631 in ?? ()
#27 0x000034fbf0544271 in ?? ()
#28 0x0000226624a23611 in ?? ()
#29 0x000029731aa077e9 in ?? ()
#30 0x0000000000000000 in ?? ()

nwrbs avatar Jul 08 '18 21:07 nwrbs

I installed and built nodejs from source

But you don't need to. You only need a debug build of node-lmdb, not node.

git clone [email protected]:Venemo/node-lmdb.git
cd node-lmdb
node-gyp configure --debug
node-gyp build --debug

Then use the resulting binaries from this build.

Venemo avatar Jul 08 '18 21:07 Venemo

Ok. Tried above and got the following for line 3

node-gyp configure --debug
gyp info it worked if it ends with ok
gyp info using [email protected]
gyp info using [email protected] | linux | x64
gyp http GET https://nodejs.org/download/release/v8.11.3/node-v8.11.3-headers.tar.gz
gyp http 200 https://nodejs.org/download/release/v8.11.3/node-v8.11.3-headers.tar.gz
gyp http GET https://nodejs.org/download/release/v8.11.3/SHASUMS256.txt
gyp http 200 https://nodejs.org/download/release/v8.11.3/SHASUMS256.txt
gyp info spawn /usr/bin/python2
gyp info spawn args [ '/usr/lib/node_modules/node-gyp/gyp/gyp_main.py',
gyp info spawn args   'binding.gyp',
gyp info spawn args   '-f',
gyp info spawn args   'make',
gyp info spawn args   '-I',
gyp info spawn args   '/nodelmdb/node-lmdb/build/config.gypi',
gyp info spawn args   '-I',
gyp info spawn args   '/usr/lib/node_modules/node-gyp/addon.gypi',
gyp info spawn args   '-I',
gyp info spawn args   '/home/xyz/.node-gyp/8.11.3/include/node/common.gypi',
gyp info spawn args   '-Dlibrary=shared_library',
gyp info spawn args   '-Dvisibility=default',
gyp info spawn args   '-Dnode_root_dir=/home/xyz/.node-gyp/8.11.3',
gyp info spawn args   '-Dnode_gyp_dir=/usr/lib/node_modules/node-gyp',
gyp info spawn args   '-Dnode_lib_file=/home/xyz/.node-gyp/8.11.3/<(target_arch)/node.lib',
gyp info spawn args   '-Dmodule_root_dir=/nodelmdb/node-lmdb',
gyp info spawn args   '-Dnode_engine=v8',
gyp info spawn args   '--depth=.',
gyp info spawn args   '--no-parallel',
gyp info spawn args   '--generator-output',
gyp info spawn args   'build',
gyp info spawn args   '-Goutput_dir=.' ]
module.js:549
    throw err;
    ^

Error: Cannot find module 'nan'
    at Function.Module._resolveFilename (module.js:547:15)
    at Function.Module._load (module.js:474:25)
    at Module.require (module.js:596:17)
    at require (internal/module.js:11:18)
    at [eval]:1:1
    at ContextifyScript.Script.runInThisContext (vm.js:50:33)
    at Object.runInThisContext (vm.js:139:38)
    at Object.<anonymous> ([eval]-wrapper:6:22)
    at Module._compile (module.js:652:30)
    at evalScript (bootstrap_node.js:466:27)
gyp: Call to 'node -e "require('nan')"' returned exit status 1 while in binding.gyp. while trying to load binding.gyp
gyp ERR! configure error 
gyp ERR! stack Error: `gyp` failed with exit code: 1
gyp ERR! stack     at ChildProcess.onCpExit (/usr/lib/node_modules/node-gyp/lib/configure.js:345:16)
gyp ERR! stack     at emitTwo (events.js:126:13)
gyp ERR! stack     at ChildProcess.emit (events.js:214:7)
gyp ERR! stack     at Process.ChildProcess._handle.onexit (internal/child_process.js:198:12)
gyp ERR! System Linux 3.10.0-862.6.3.el7.x86_64
gyp ERR! command "/usr/bin/node" "/usr/bin/node-gyp" "configure" "--debug"
gyp ERR! cwd /nodelmdb/node-lmdb
gyp ERR! node -v v8.11.3
gyp ERR! node-gyp -v v3.7.0
gyp ERR! not ok 

nwrbs avatar Jul 08 '18 21:07 nwrbs

I think I have it now:

gdb) bt
#0  0x00007ffff6bfd277 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
#1  0x00007ffff6bfe968 in __GI_abort () at abort.c:90
#2  0x00007ffff6c3fd37 in __libc_message (do_abort=do_abort@entry=2, fmt=fmt@entry=0x7ffff6d51d58 "*** Error in `%s': %s: 0x%s ***\n")
    at ../sysdeps/unix/sysv/linux/libc_fatal.c:196
#3  0x00007ffff6c48499 in malloc_printerr (ar_ptr=0x7ffff6f8d760 <main_arena>, ptr=<optimized out>, str=0x7ffff6d51e40 "double free or corruption (top)", action=3)
    at malloc.c:5025
#4  _int_free (av=0x7ffff6f8d760 <main_arena>, p=<optimized out>, have_lock=0) at malloc.c:3847
#5  0x00007ffff499b78e in mdb_cursor_close (mc=0x226c1b0) at ../dependencies/lmdb/libraries/liblmdb/mdb.c:8550
#6  0x00007ffff49b1690 in CursorWrap::close (info=...) at ../src/cursor.cpp:94
#7  0x00007ffff49a3a83 in Nan::imp::FunctionCallbackWrapper (info=...) at ../node_modules/nan/nan_callbacks_12_inl.h:174
#8  0x0000000000a5ff63 in v8::internal::FunctionCallbackArguments::Call(void (*)(v8::FunctionCallbackInfo<v8::Value> const&)) ()
#9  0x0000000000ad710c in v8::internal::MaybeHandle<v8::internal::Object> v8::internal::(anonymous namespace)::HandleApiCallHelper<false>(v8::internal::Isolate*, v8::internal::Handle<v8::internal::HeapObject>, v8::internal::Handle<v8::internal::HeapObject>, v8::internal::Handle<v8::internal::FunctionTemplateInfo>, v8::internal::Handle<v8::internal::Object>, v8::internal::BuiltinArguments) ()
#10 0x0000000000ad7d5f in v8::internal::Builtin_HandleApiCall(int, v8::internal::Object**, v8::internal::Isolate*) ()
#11 0x000008c5422842fd in ?? ()
#12 0x0000006600000000 in ?? ()
#13 0x000008c542284241 in ?? ()
#14 0x00007fffffffc1a0 in ?? ()
#15 0x0000000000000006 in ?? ()
#16 0x00007fffffffc290 in ?? ()
#17 0x000008c54233d1d6 in ?? ()
#18 0x00001f6cc1e822d1 in ?? ()
#19 0x0000049b6d104099 in ?? ()
#20 0x0000000400000000 in ?? ()
#21 0x00000baf842c2c11 in ?? ()
#22 0x00001f6cc1e822d1 in ?? ()
#23 0x00001f6cc1e822d1 in ?? ()
#24 0x00001f6cc1e822d1 in ?? ()
#25 0x00001f6cc1e822d1 in ?? ()
#26 0x00000baf842c2be1 in ?? ()
#27 0x000002c11475b111 in ?? ()
#28 0x00000baf842c2bc1 in ?? ()
#29 0x0000049b6d104099 in ?? ()
#30 0x0000000000000000 in ?? ()

nwrbs avatar Jul 08 '18 21:07 nwrbs

Thank you @nwbrad the relevant line seems to be CursorWrap::close (info=...) at ../src/cursor.cpp:94 which tells us that there is a double free somehow in there.

Can you give me a small javascript testcase, so that I can experiment with it?

Venemo avatar Jul 09 '18 07:07 Venemo

Below is the function in which the error occurs. All logic related to other things has been removed, so it just shows the LMDB logic I used. The issue appears to occur when I reuse the Cursor variable. If I create a second cursor variable and close the cursors before I commit the transactions, the issue does not appear to happen... I run the function on multiple tables and the issue only happens with one table that has more rows than my other tables. As a result, the issue may not be reproducible.

convertTable(env, tblDbi, tblName, dataType) { let dbiTmpTbl = this.envTmp.openDbi({ name: 'tmpTbl', keyIsBuffer: true, create: true });

let txnTmp = this.envTmp.beginTxn();
let txn = env.beginTxn();

let keyBufX = null;
let valBuf = null;
let valBufX = null;
let iCnt = 0;

let cursor = new lmdb.Cursor(txn, tblDbi, { keyIsBuffer: true });

for (let key = cursor.goToFirst(); key !== null; key = cursor.goToNext()) {
    iCnt++;
    try {
        valBuf = cursor.getCurrentBinary();

        if (dataType === 's'){
            txnTmp.putBinary(dbiTmpTbl, keyBufX, valBufX, { keyIsBuffer: true });
        } else {
            txnTmp.putBinary(dbiTmpTbl, keyBufX, valBuf, { keyIsBuffer: true });
        }
        txn.del(tblDbi, key);

    } catch (e) {
        console.log(e);
    }
}

txnTmp.commit();
txn.commit();
cursor.close();

txnTmp = this.envTmp.beginTxn();
txn = env.beginTxn();

cursor = new lmdb.Cursor(txnTmp, dbiTmpTbl, { keyIsBuffer: true });

for (let key = cursor.goToFirst(); key !== null; key = cursor.goToNext()) {
    try {
        valBuf = cursor.getCurrentBinary();
        if (dataType === 's'){
            txn.putBinary(tblDbi, key, valBuf, { keyIsBuffer: true });
        } else {
            txn.putBinary(tblDbi, key, valJSONBuf, { keyIsBuffer: true });
        }
    } catch (e) {
        console.log(e);
    }
}

txn.commit();
txnTmp.commit();
cursor.close();

dbiTmpTbl.drop(true);

nwrbs avatar Jul 09 '18 15:07 nwrbs

Not sure what would cause this but I'll investigate. Sorry for the slow reply.

Venemo avatar Jul 11 '18 19:07 Venemo

It does seem a bit odd how it occurs. Thanks for the help.

nwrbs avatar Jul 12 '18 15:07 nwrbs

@nwbrad Does it only occur with one specific database, though?

Venemo avatar Jul 13 '18 11:07 Venemo

Sort of. It is one specific Environment, I tried multiple copies of the Environment, restored from backup and tried older versions, all the same result. Nothing fancy about the Environment. It has databases that originally used Int Keys and String Keys to store either strings or numbers. The app that has the issue access LMDB through using binary/buffer access methods to convert utf16 string to utf8 strings and in some databases converts values from number values to utf8 strings. (Now all my programs use the binary/buffer process with LMDB. As I started using C and Java to access the same LMDB the utf16 and null terminator started causing issues. Plus I wanted to support 64 bit integers.)

The environment in question has 4 databases in it and seems to work fine other than this issue.
I'll call the database with the issue Database X the others as A , B and C respectively. If I run the conversion on just X, I have no problem.
If I run the conversion A, B and C have no problem. If I run the conversion on A, B, C and X then the problem occurs.

nwrbs avatar Jul 13 '18 15:07 nwrbs

@nwbrad Can you give me some version of this environment (possibly with some dummy data) where I can reproduce this problem?

Venemo avatar Jul 14 '18 06:07 Venemo

I won't have time this week to do it, but I will try the week after next.

nwrbs avatar Jul 16 '18 02:07 nwrbs

Thanks @nwbrad I'm looking forward to seeing it.

Venemo avatar Jul 19 '18 12:07 Venemo

@nwbrad Any news about this?

If you can give me a dataset and some code with which the issue is reproducible I am fairly sure I can fix the problem. However without that I would be just shooting in the dark.

Venemo avatar Aug 03 '18 11:08 Venemo

Sorry, I'm caught up in a few other things I need to get done. I probably won't be able to look at this for a couple of weeks. If you want to close it and I can reopen it when I have time to create data for it, that would be okay.

nwrbs avatar Aug 03 '18 15:08 nwrbs

@nwbrad No problem, we can keep this open for a few more weeks. In the meantime if any other user encounters the same or similar issue, they can chime in too.

Venemo avatar Aug 03 '18 16:08 Venemo