`Segmentation fault` when executing worker with `eval: true` containing dynamic import after `await`
Version
v16.15.0
Platform
Microsoft Windows NT 10.0.19044.0 x64 / WSL: Linux PC 4.4.0-19041-Microsoft #1237-Microsoft Sat Sep 11 14:32:00 PST 2021 x86_64 x86_64 x86_64 GNU/Linux
Subsystem
No response
What steps will reproduce the bug?
- Download https://gist.github.com/sapphi-red/a0179b52ebd3b6a19f51743594810ecf
- Run
node eval_cjs.mjs. It exits with non-0 exit code. a. On windows, it does not show any error messages. The output ofecho $LASTEXITCODEis-1073741819. b. On WSL (Ubuntu), it showsSegmentation fault (core dumped). The output ofecho $?is139. - Run
node eval_cjs_cjs.cjs. It exits with non-0 exit code same witheval_cjs.mjs. - Run
noneval_cjs.mjs. It finishes without any errors. - Run
noneval_mjs.mjs. It finishes without any errors.
The difference between the files are:
| filename | parent script | worker script | is eval | result |
|---|---|---|---|---|
eval_cjs.mjs |
ESM | CJS | O | fail |
eval_cjs_cjs.cjs |
CJS | CJS | O | fail |
noneval_cjs.mjs |
ESM | CJS | X | success |
noneval_mjs.mjs |
ESM | ESM | X | success |
How often does it reproduce? Is there a required condition?
It reproduces every time on my machine. const startAt = Date.now() + 10 * 1000 might differ between machines.
I suppose at least it needs to meet the following condition:
-
workeris created witheval: true - dynamic import is called after
await
I am not sure why it requires 10 seconds delay.
What is the expected behavior?
Segmentation fault not happening.
What do you see instead?
Segmentation fault is happening.
Additional information
context: I was tring to fix https://github.com/vitejs/vite/pull/8049#issuecomment-1135919637. (FailureMessage one is not related to this.)
Thanks for the report. I can also reproduce with v18.2.0. The stack trace looks like this:
(lldb) bt
* thread #2, stop reason = EXC_BAD_ACCESS (code=1, address=0x0)
* frame #0: 0x0000000100097270 node`node::loader::ImportModuleDynamically(v8::Local<v8::Context>, v8::Local<v8::Data>, v8::Local<v8::Value>, v8::Local<v8::String>, v8::Local<v8::FixedArray>) + 608
frame #1: 0x00000001003afcb6 node`v8::internal::Isolate::RunHostImportModuleDynamicallyCallback(v8::internal::Handle<v8::internal::Script>, v8::internal::Handle<v8::internal::Object>, v8::internal::MaybeHandle<v8::internal::Object>) + 886
frame #2: 0x0000000100821ff5 node`v8::internal::Runtime_DynamicImportCall(int, unsigned long*, v8::internal::Isolate*) + 293
frame #3: 0x0000000100c11a34 node`Builtins_CEntry_Return1_DontSaveFPRegs_ArgvInRegister_NoBuiltinExit + 52
frame #4: 0x0000000100cb0d0e node`Builtins_CallRuntimeHandler + 78
frame #5: 0x0000000100b93c4f node`Builtins_InterpreterEntryTrampoline + 207
frame #6: 0x0000000100bc8e3f node`Builtins_AsyncFunctionAwaitResolveClosure + 63
frame #7: 0x0000000100c62d38 node`Builtins_PromiseFulfillReactionJob + 56
frame #8: 0x0000000100bba4c4 node`Builtins_RunMicrotasks + 644
frame #9: 0x0000000100b91c83 node`Builtins_JSRunMicrotasksEntry + 131
frame #10: 0x0000000100391f70 node`v8::internal::(anonymous namespace)::Invoke(v8::internal::Isolate*, v8::internal::(anonymous namespace)::InvokeParams const&) + 2944
frame #11: 0x0000000100392533 node`v8::internal::(anonymous namespace)::InvokeWithTryCatch(v8::internal::Isolate*, v8::internal::(anonymous namespace)::InvokeParams const&) + 83
frame #12: 0x0000000100392721 node`v8::internal::Execution::TryRunMicrotasks(v8::internal::Isolate*, v8::internal::MicrotaskQueue*, v8::internal::MaybeHandle<v8::internal::Object>*) + 81
frame #13: 0x00000001003bda7b node`v8::internal::MicrotaskQueue::RunMicrotasks(v8::internal::Isolate*) + 315
frame #14: 0x00000001003be2b1 node`v8::internal::MicrotaskQueue::PerformCheckpoint(v8::Isolate*) + 97
frame #15: 0x0000000100001dfe node`node::InternalCallbackScope::Close() + 462
frame #16: 0x00000001000017de node`node::InternalCallbackScope::~InternalCallbackScope() + 14
frame #17: 0x000000010006c232 node`node::Environment::RunTimers(uv_timer_s*) + 578
frame #18: 0x0000000100b6f186 node`uv__run_timers + 54
frame #19: 0x0000000100b73b07 node`uv_run + 247
frame #20: 0x0000000100002f9f node`node::SpinEventLoop(node::Environment*) + 287
frame #21: 0x000000010017da98 node`node::worker::Worker::Run() + 2056
frame #22: 0x0000000100181472 node`node::worker::Worker::StartThread(v8::FunctionCallbackInfo<v8::Value> const&)::$_3::__invoke(void*) + 50
frame #23: 0x00007fff7e2722eb libsystem_pthread.dylib`_pthread_body + 126
frame #24: 0x00007fff7e275249 libsystem_pthread.dylib`_pthread_start + 66
frame #25: 0x00007fff7e27140d libsystem_pthread.dylib`thread_start + 13
From looking at the disassembly I expect the nullptr dereference is around here: https://github.com/nodejs/node/blob/1531ef1c0c30b35b764981683383187261c97779/src/module_wrap.cc#L585-L605
I have located the code location that triggers the null pointer exception: https://github.com/nodejs/node/compare/master...leizongmin:fix/issue-43205
->Uint32Value(context)
.ToChecked();
if (type == ScriptType::kScript) {
- contextify::ContextifyScript* wrap = env->id_to_script_map.find(id)->second;
- object = wrap->object();
+ auto it = env->id_to_script_map.find(id);
+ CHECK_NE(it, env->id_to_script_map.end());
+ object = it->second->object();
} else if (type == ScriptType::kModule) {
ModuleWrap* wrap = ModuleWrap::GetFromID(env, id);
object = wrap->object();
Below is the simplified code that reproduces the problem in versions v16.14.2 and v19.0.0-pre:
import { Worker } from "worker_threads";
const code = `
(async function () {
await new Promise((resolve) => setTimeout(resolve, 10 * 1000));
import('worker_threads');
})();
`;
const worker = new Worker(code, { eval: true });
After adding the null pointer check, the error message is like this:
node[5913]: ../../src/module_wrap.cc:595:v8::MaybeLocal<v8::Promise> node::loader::ImportModuleDynamically(v8::Local<v8::Context>, v8::Local<v8::Data>, v8::Local<v8::Value>, v8::Local<v8::String>, v8::Local<v8::FixedArray>): Assertion `(it) != (env->id_to_script_map.end())' failed.
1: 0x55f5348bd284 node::Abort() [node]
2: 0x55f5348bd318 [node]
3: 0x55f53487286c [node]
4: 0x55f534c69a30 v8::internal::Isolate::RunHostImportModuleDynamicallyCallback(v8::internal::Handle<v8::internal::Script>, v8::internal::Handle<v8::internal::Object>, v8::internal::MaybeHandle<v8::internal::Object>) [node]
5: 0x55f53510aa86 v8::internal::Runtime_DynamicImportCall(int, unsigned long*, v8::internal::Isolate*) [node]
6: 0x55f535577eb4 [node]
I will continue to focus on this issue.
I found this issue is related to https://github.com/nodejs/node/issues/38695 and https://github.com/nodejs/node/issues/25424
Simpler reproducible code Check https://github.com/nodejs/node/issues/44211#issuecomment-1271137984 for the fix I came with :)
const vm = require('node:vm')
const code = `
new Promise(resolve => {
setTimeout(() => {
gc(); // gc the vm.Script instance, causing executing 'import' in microtask a segment fault
resolve();
}, 1);
}).then(() => import('http'));`
// under the hood, a vm.Script instance is created
vm.runInThisContext(code)
Related: https://github.com/nodejs/node/issues/47096
Another minimized test case from a different perspective is this: node-segfault.js:
let Module = module.constructor;
Module.wrap = Module.wrap; // put the loader into patched mode
let { dynamicImport } = require('./dynamic-import');
gc();
dynamicImport();
dynamic-import.js:
module.exports.dynamicImport = () => { return import ('./node-segfault.js'); } // dynamically import anything
Setting the Module.wrap puts the loader into "patched" mode, where it uses the vm Script/context object. This creates a ContextifyScript, which records the script ids in the id_to_script_map map, but is also weakly referenced, so the garbage collection causes the id/script to be removed from id_to_script_map when GC'ed, and so the offending line from ImportModuleDynamically attempts to access a script by id that is no longer there.
To help with anyone searching for the segfault isses, modifying/patching Module.wrap is done by some libraries, like the rewire package. And if you came here looking for a terrible and horrible hack to work around this seg fault, you can do this to prevent the GC-induced segfault:
const { Script } = require('vm');
let pinnedScripts = [];
let originalRun = Script.prototype.runInThisContext;
Script.prototype.runInThisContext = function (options) {
pinnedScripts.push(this);
return originalRun.call(this, options);
};
Making ContextifyScript not weak also eliminates this issue, but I assume scripts are supposed to be collectable.
I tested https://github.com/nodejs/node/pull/48510 on the repro in the OP, in https://github.com/nodejs/node/issues/43205#issuecomment-1272209939 and https://github.com/nodejs/node/issues/43205#issuecomment-1534663358 and they are all fixed locally. I'll write a test based on the repo in https://github.com/nodejs/node/issues/43205#issuecomment-1272209939 in the PR because it's simpler
To help with anyone searching for the segfault isses, modifying/patching
Module.wrapis done by some libraries, like the rewire package. And if you came here looking for a terrible and horrible hack to work around this seg fault, you can do this to prevent the GC-induced segfault:const { Script } = require('vm'); let pinnedScripts = []; let originalRun = Script.prototype.runInThisContext; Script.prototype.runInThisContext = function (options) { pinnedScripts.push(this); return originalRun.call(this, options); };Making
ContextifyScriptnot weak also eliminates this issue, but I assume scripts are supposed to be collectable.
I've been trying to use this workaround but I'm not getting it to stick. I have the exact same issue as here, but with typescript. Forgive me for being a total js/ts beginner, but can you give a wider example of where to put the workaround? Specifically, I have a jest test with a beforeAll. I tried it both in the top of the file as well as in the beforeAll and my jest test setup script (ts), but to no effect.
@etnoy You might try verifying/debugging that the pinnedScripts are indeed being appended. And I believe you do need to ensure that this is executed before you load the module that will execute a dynamic import. Sorry, I don't have any suggestions besides that.
@etnoy You might try verifying/debugging that the
pinnedScriptsare indeed being appended. And I believe you do need to ensure that this is executed before you load the module that will execute a dynamic import. Sorry, I don't have any suggestions besides that.
Thanks, will do!
Closing as https://github.com/nodejs/node/pull/48510 has landed and should fix this. We can re-open if that turns out to be incorrect (hopefully not).