module: implement synchronous module evaluate hooks
This patch adds an evaluate hook that can be used as a supported alternative to monkey-patching the CJS module loader, when the primary use case is to modify the exports of modules, with the goal of reducing dependency on CJS loader monkey-patching in the wild.
// mod.js
exports.prop = 'original';
// hook.js
require('module').registerHooks({
evaluate(context, nextEvaluate) {
context.module.exports; // Access the exports before evaluation.
nextEvaluate(); // Invokes default evaluation. Can be skipped.
context.module.exports.prop; // 'original'
context.module.exports.prop = 'updated'; // Update exports.
},
});
require('./mod.js'); // { prop: 'updated' }
The evaluate hook is run after the resolve and load hook, abstracting the final execution of the code in the module. It is only available to module.registerHooks. It is currently only run for the execution of the following modules:
- CommonJS modules, either being
required orimported. In this case,context.moduleequals to themoduleobject in the CommonJS module being evaluated, andcontext.module.exportsis mutable. - ECMAScript modules that are
required. This hook would only be run for the evaluation of the module being directlyrequired, but could not be run for each of its inner modules beingimported. In this case,context.moduleis aModulewrapper around the ECMAScript modules. Reassigningcontext.module.exportsto something else only affects the result ofrequire()call, but would not affect access within the ECMAScript module. Properties ofcontext.module.exports(exports of the ECMAScript module) are not mutable.
In future versions it may cover more module types, but the following are unlikely to be supported due to restrictions in the ECMAScript specifications:
- The ability to skip evaluation of an inner ECMAScript module being
imported by ECMAScript modules. - The ability to mutate the exports of a ECMAScript module.
For the ability to customize execution and exports of all the ECMAScript modules in the graph, consider patching the source code of the ECMAScript modules using the load hook as an imperfect workaround.
Refs: https://github.com/nodejs/node/issues/56241
In the test I included a POC for building https://www.npmjs.com/package/require-in-the-middle on top of it instead of monkey-patching. Support for (non-mutable) ESM evaluation depends on https://issues.chromium.org/u/1/issues/384413088 - mutability of ESM exports is off the table for V8 without spec support, so it's a non-goal/out of scope for this hook, and this mostly focus on at least sunsetting CJS monkey-patching.
Review requested:
- [ ] @nodejs/loaders
CI: https://ci.nodejs.org/job/node-test-pull-request/65323/
Given this is CJS-only I think we should clearly call the hook cjsEvaluate or similar.
Codecov Report
:x: Patch coverage is 96.75926% with 7 lines in your changes missing coverage. Please review.
:white_check_mark: Project coverage is 89.03%. Comparing base (baa60ce) to head (9a7985c).
:warning: Report is 1311 commits behind head on main.
| Files with missing lines | Patch % | Lines |
|---|---|---|
| lib/internal/modules/esm/translators.js | 73.07% | 7 Missing :warning: |
Additional details and impacted files
@@ Coverage Diff @@
## main #57139 +/- ##
==========================================
- Coverage 89.03% 89.03% -0.01%
==========================================
Files 665 665
Lines 193408 193595 +187
Branches 37279 37305 +26
==========================================
+ Hits 172206 172369 +163
- Misses 13886 13893 +7
- Partials 7316 7333 +17
| Files with missing lines | Coverage Ξ | |
|---|---|---|
| lib/internal/errors.js | 97.46% <100.00%> (+<0.01%) |
:arrow_up: |
| lib/internal/modules/cjs/loader.js | 98.15% <100.00%> (-0.12%) |
:arrow_down: |
| lib/internal/modules/customization_hooks.js | 100.00% <100.00%> (ΓΈ) |
|
| lib/internal/modules/esm/translators.js | 90.59% <73.07%> (-0.87%) |
:arrow_down: |
:rocket: New features to boost your workflow:
- :snowflake: Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
- :package: JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.
Given this is CJS-only I think we should clearly call the hook cjsEvaluate or similar.
The idea is that we'll have non-mutable support for ESM (e.g. for tracing perf or diagnostics) when we have https://issues.chromium.org/issues/384413088, not that it will never support ESM. Also I am not sure if you have noticed, but this supports require(esm) as well as import cjs. It will also be able to support import json, import wasm and import addon. The only thing it cannot support with just Node.js changes is import esm, which requires a V8 change, which is still possible, just that the hook has to pretend that ESM has a frozen module.exports.
I see that it supports require(esm), and that is via CJS wrapper mutation not an ESM mutation right? In that case it would still be CJS-specific, but which all works well IMO.
My feedback is just that a specific name might be useful to make the distinction clear to end users that this hook only applies to CJS and ESM CJS wrappers.
this hook only applies to CJS and ESM CJS wrappers.
I am not sure why this is not clear from OP or from the comment above, I guess I will have to repeat again, this will be able to support
- import ESM, with future V8 changes, without mutation support
- import json
- import wasm
- import addon
It's just not done in this PR, and left as TODOs. Does it make sense to ask people to use a hook named cjsEvaluate to customize the edges above?
This would not apply to JSON, Wasm and Addons imported into ES modules though right?
Does it make sense to ask people to use a hook named cjsEvaluate to customize the edges above?
I did have the same thought as Guy, that perhaps people would be confused about what this applies to, but I agree that if it's not CommonJS-specific then we shouldn't name it cjsEvaluate. I considered syncEvaluate but that seems confusing on its own since there presumably won't be an asyncEvaluate.
If the purpose of this is to modify the exports, we could potentially call it exports, to emphasize that part; and then the docs could explain that ESM exports aren't editable so this hook can only read but not change them. If it allows for more than just modifying the exports but also changing the contents of the module itself, then I guess evaluate is probably the best name.
This would not apply to JSON, Wasm and Addons imported into ES modules though right?
It would. They are actually easier to cover because they are mutable (since synthetic module exports are mutable externally) and there is already a evaluation callback phase, and I already left TODO in this PR where the hook can be implemented for these. Only import SourceTextModule need a V8 change and would have a bit more restriction (exports won't be directly mutable. But the hook can still be invoked, and at least can modify deeper properies of the exports..).
If you absolutely have to see it I can e.g. implement the hook for import addons in this PR and demonstrate how to skip addons initialization when they are imported into a source text module and point the exports to something else (or I can think of some interesting use case like toggling Linux perf before and after the initialization of an addon or the invocation of its method to trace syscalls made by it). But I think it should be rather obvious in the TODO I left in the code here how it can be implemented without actually implementing it to prove that it's implementable (and I mostly do not want to implement it here because I want to consolidate the addon evaluation in ESM and CJS first for better reuse instead of repeating that default implementation at least 2 more times in the code..)
If it allows for more than just modifying the exports but also changing the contents of the module itself, then I guess evaluate is probably the best name.
It will also allow skipping default evaluation entirely for most modules except source text modules (synthetic module evaluation, or wrapper evaluation like imported WASM, is still controlled by Node.js and skippable). Just that the primary motivation of this for existing use case is still modifying exports. But I can see it being used for e.g. implementing stubs, or tracing evaluation of modules for diagnostics.
It will also allow skipping default evaluation entirely for most modules except source text modules
What if the ultimate way source text modules are supported does not align with this hook design? Do we then need an evaluteV2 hook?
What if the ultimate way source text modules are supported does not align with this hook design? Do we then need an evaluteV2 hook?
I think the default evaluate for source text module can just be a no-op, or throw an error, or just be undefined, with some additional context parameter signaling this case and a documentation note about the limitation. In any case, it's going to be experimental, if we do not version all the other experimental APIs like this when we change them, I don't see why this has to be an exception? (we didn't even rename the hooks when they were moved off-thread, whatever changes made for ESM here can't be any more drastic than that..)
My honest opinion here is I don't think there is a future where this hook can work for source text modules, therefore I would prefer an API that takes that into account from the start. Whether or not you agree with me, if I am right, then it seems strange to me to have a hook called "evaluate" that will never apply to source text modules.
My honest opinion here is I don't think there is a future where this hook can work for source text modules, therefore I would prefer an API that takes that into account from the start. Whether or not you agree with me, if I am right, then it seems strange to me to have a hook called "evaluate" that will never apply to source text modules.
We do have resolve and load hooks that do not work for e.g. CJS loaded by createRequire(), or CJS loaded by a CJS entry point. Would you rather rename them? If so, what would it be?
I think at worst this could still be a post-evaluate hook for source text modules (that does not even strictly require a V8 change...but it would be less hacky if it is). IMO evaluate is the best name I can think of for a hook that can at least customize
- require(anything)
- import any-thing-but-source-text-module
If you think evaluate does not make sense, do you have a better suggestion? Do you insist that we should use cjsEvaluate to customize e.g. import addon from a completely ESM graph?
If you think evaluate does not make sense, do you have a better suggestion? Do you insist that we should use cjsEvaluate to customize e.g. import addon from a completely ESM graph?
Since it is operating on a module.exports object which is a plain object and not a ModuleNamespace, I think moduleExportsHook might be another alternative.
I also have concerns about it applying for JSON and Wasm in that it would act on a CJS-style exports object as a proxy for these in the dynamic module instead of the underlying native module namespace.
Since it is operating on a module.exports object which is a plain object and not a ModuleNamespace, I think moduleExportsHook might be another alternative.
I don't think it will be a module.exports for other cases - the more likely case is a vm.Module wrapper, or something similar, that allows more powerful manipulation of the underlying v8::Module instance. It would be similar to e.g. importModuleDynamically where the referrer can be of multiple types depending on where it comes from.
Right, but until it's clear there's a path forward for such a construct, perhaps let's leave the design space open for now, to avoid the future breaking change that would be needed, even though the API is experimental.
Since we are operating on module.exports today and not ModuleInstance.
perhaps let's leave the design space open for now, to avoid the future breaking change that would be needed, even though the API is experimental.
I am not sure what you mean by leaving the design space open - do you mean not adding an evaluate hook, and prefer to let user land continue monkey patching the CJS loader instead?
I wonder if we can have a compromise of naming the CJS module wrapper context.cjsModule. Then users would just check if (context.cjsModule) and handle it if it exists instead. When we have support for the others, it will be put on a different key (perhaps context.esModule), and of a different type (probably just the existing vm.Module subclasses, though I suspect they need some refactoring to be able to wrap around an built-in module). We might even be able to put both on the context in the case of import cjs or require(esm), each corresponding to the entry in the CJS and ESM cache.
I think that something like you describe might be closer to the right direction here. Properly plotting what the path looks like for a more general hook. But then what about a require(esm) or a import(cjs) - would these have both a cjsModule and an esModule context value? Or would the wrapper dynamically adjust on either side to mutations of the original wrapper type?
My original feedback was very simple - let's name this something specific to the CommonJS module.exports mutation it is designed to handle. What specifically is your concern with exportsHook or cjsEvaluate or exportsEvaluate or something similar? If you want to go the more general path I think more discussion would be needed here.
But then what about a require(esm) or a import(cjs) - would these have both a cjsModule and an esModule context value? Or would the wrapper dynamically adjust on either side to mutations of the original wrapper type?
Yes.
What specifically is your concern with exportsHook or cjsEvaluate or exportsEvaluate or something similar?
I think none of these are better than evaluate if it's powerful enough to handle at least customization and evaluation flow of
- require(anything)
- import anything-but-source-text-module
Even for imported source text module, we can at least invoke it post-evaluation. I don't think it's a much of a problem to just document that for imported source text modules, default evaluation cannot be controlled, or just add different hooks for imported source text modules, and say that it's an exception. It seems rather unfortunate to name the hook based on one exception, and is not consistent with how we name the async load/resolve hooks even though they also have their own exceptions.
so long as the design does not incorporate support for source text module.
The design does incorporate support for source text modules, for example, this would be possible:
evaluate(context, nextEvaluate) {
if (context.esModule) {
if (context.esModule.isSourceTextModule)
if (context.esModule.status === 'evaluated') { // It's already evaluated at this point, because it's imported
nextEvaluate(); // This just invokes other evaluate hooks, which also need to handle this case.
context.esModule.namespace.test = 1; // This does not work, because STM exports are frozen.
context.esModule.namespace.test.prop = 2; // This does work, if user doesn't freeze it
} else { // context.esModule.status === 'evaluating', it's being required for the first time
nextEvaluate(); // This may invoke default evaluation - can be skipped.
context.esModule.namespace.test.prop = 2;
}
}
if (!context.esModule.isSourceTextModule) {
context.esModule.setExport('test', 1); // This works, because it's a synthetic module.
}
}
}
The design does incorporate support for source text modules, for example, this would be possible:
This hook has two properties: (1) monkey patching existing instances, and (2) intercepting the execution of ESM modules down the graph including this nextEvaluate interception so that modules are all only exposed to all importers after this hook has run. (1) I can see being possible, but (2) is currently not spec compatible due to it breaking spec execution invariants. I think we would need to have more standards-focused design here before it is clear there is a standards-compatible path forward.
@joyeecheung I'd be interested to hear further about why the evaluate name is so important to you? It seems not such a big change to make to me to consider alternative names.
Will be great to follow up in the loaders meeting.
- is currently not spec compatible due to it breaking spec execution invariants.
I am not sure if you have read the comments, but it's spec-complaint - when imported source text modules arrive in this hook, it's status will always already be 'evaluated'. In this case, nextEvaluate only invokes other user evaluate hooks which also need to handle it, and by the end of the chain, nextEvaluate would be a no-op. It would not allow skipping of the default evaluation.
I'd be interested to hear further about why the evaluate name is so important to you? It seems not such a big change to make to me to consider alternative names.
I think this is the most intuitive name, while others are all more confusing in their own ways. As said before this is not a CJS-only hook, so using cjs is misleading. This hook would also be useful for tracing and diagnostics, and not just for modifying exports (not all modules have exports in the first place - what happens if one just wants to trace a script?), so using exports is also misleading. Another possibility I do not want to rule out is to just say that evaluate() is not supported for imported SourceTextModule, but they are supported through additional hooks like breforeEvaluate() and afterEvaluate(). Other types of modules just support all of the three hooks. And all the other options with before or after prefixed are just even more confusing.
Will be great to follow up in the loaders meeting.
I won't be able to go to the loaders meeting next week, the other one in 3 weeks might work.
I am not sure if you have read the comments, but it's spec-complaint - when imported source text modules arrive in this hook, it's status will always already be 'evaluated'. In this case, nextEvaluate only invokes other user evaluate hooks which also need to handle it, and by the end of the chain, nextEvaluate would be a no-op. It would not allow skipping of the default evaluation.
The problem here is a.mjs importing b.mjs where we currently only run the hook for a.mjs's CJS wrapper module and not the native a.mjs or b.mjs at all. In the model where it runs for all modules, that would not be spec compliant as it would involve execution injection between successive dependency module sources which is not currently something permitted in the spec for evaluation. So the hook as designed to be planned to generalize in future simply isn't spec compatible.
So there's a lot more engagement needed to figure out the general spec compatible path forward here. I don't want to slow things down though which is why I suggested a rename so this can land immediately and then we can follow up with discussions. But if you want to go through the discussions first and let this sit for a while I'm happy to do that too.
The problem here is a.mjs importing b.mjs where we currently only run the hook for a.mjs's CJS wrapper module and not the native a.mjs or b.mjs at all.
If we just say that evaluate does not support imported SourceTextModule, then if a.mjs is loaded by import, no hook would be run for a.mjs. If a.mjs is loaded by require, then as documented it only runs for what should be returned by require, and require is not part of the spec. You can already skip the evaluation of require("./a.mjs") today by monkey patching the CJS loader.
So the hook as designed to be planned to generalize in future simply isn't spec compatible.
This hook is not planned to support skipping evaluation of imported source text modules, as already mentioned in the OP and in the documentation added in the PR, and several times in the comments abbove.
If we just say that evaluate does not support imported SourceTextModule, then if a.mjs is loaded by import, no hook would be run for a.mjs. If a.mjs is loaded by require, then as documented it only runs for what should be returned by require, and require is not part of the spec. You can already skip the evaluation of require("./a.mjs") today by monkey patching the CJS loader.
An evaluation hook that does not apply at all in applications that only use ES modules is not an evaluation hook, so should have a different name.
I suggested a rename so this can land immediately and then we can follow up with discussions.
I would prefer landing it with an intuitive name instead of landing it with a confusing name, only to rename later. With this design it's still possible to extend support in an additive manner, for example, adding beforeEvaluate and afterEvaluate, or in the worst case, just say that imported SourceTextModule cannot be supported due to spec limitations.
An evaluation hook that does not apply at all in applications that only use ES modules is not an evaluation hook, so should have a different name.
Prior to module.registerHooks, the resolve and load hook were not applied at all in applications that only use CJS modules - to this day, they are still not applied at all for these applications if registered via module.register(), or if they are async. I personally don't think this is a problem, and I think this pattern is very universal in the Node.js APIs - for example many APIs do not support Windows and would not work at all when the application is run on Windows. That does not mean they have to be prefixed with POSIX. Rather, naming an API based on what it currently supports instead of what it's meant for is inconsistent with Node.js API naming conventions, even if the API will probably never work in certain cases.
@joyeecheung I really think something like evaluateExports is absolutely fine as a name, but if you truly are unwilling to budge, then we will need to escalate through the necessary Node.js process again unfortunately, which doesn't make contributing to the project particularly fun for either of us. But I have to act according to what I believe and will continue to do that.