-
Notifications
You must be signed in to change notification settings - Fork 30.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
src: add support for externally shared js builtins #44376
Changes from all commits
b7d4bcd
5428aa2
5dd4c99
6f6d80e
a614c7b
eb708c8
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,106 @@ | ||
# Maintaining Dependencies | ||
|
||
Node.js depends on additional components beyond the Node.js code | ||
itself. These dependencies provide both native and JavaScript code | ||
and are built together with the code under the `src` and `lib` | ||
directories to create the Node.js binaries. | ||
|
||
All dependencies are located within the `deps` directory. | ||
|
||
Any code which meets one or more of these conditions should | ||
be managed as a dependency: | ||
|
||
* originates in an upstream project and is maintained | ||
in that upstream project. | ||
* is not built from the `preferred form of the work for | ||
making modifications to it` (see | ||
[GNU GPL v2, section 3.](https://www.gnu.org/licenses/old-licenses/gpl-2.0.en.html) | ||
when `make node` is run. A good example is | ||
WASM code generated from C (the preferred form). | ||
Typically generation is only supported on a subset of platforms, needs | ||
additional tools, and is pre-built outside of the `make node` | ||
step and then committed as a WASM binary in the directory | ||
for the dependency under the `deps` directory. | ||
|
||
By default all dependencies are bundled into the Node.js | ||
binary, however, `configure` options should be available to | ||
use an externalized version at runtime when: | ||
|
||
* the dependency provides native code and is available as | ||
a shared library in one or more of the common Node.js | ||
distributions. | ||
* the dependency provides JavaScript and is not built | ||
from the `preferred form of the work for making modifications | ||
to it` when `make node` is run. | ||
|
||
Many distributions use externalized dependencies for one or | ||
more of these reasons: | ||
|
||
1. They have a requirement to build everything that they ship | ||
from the `preferred form of the work for making | ||
modifications to it`. This means that they need to | ||
replace any pre-built components (for example WASM | ||
binaries) with an equivalent that they have built. | ||
2. They manage the dependency separately as it is used by | ||
more applications than just Node.js. Linking against | ||
a shared library allows them to manage updates and | ||
CVE fixes against the library instead of having to | ||
patch all of the individual applications. | ||
3. They have a system wide configuration for the | ||
dependency that all applications should respect. | ||
|
||
## Supporting externalized dependencies with native code. | ||
|
||
Support for externalized dependencies with native code for which a | ||
shared library is available can added by: | ||
|
||
* adding options to `configure.py`. These are added to the | ||
shared\_optgroup and include an options to: | ||
* enable use of a shared library | ||
* set the name of the shared library | ||
* set the path to the directory with the includes for the | ||
shared library | ||
* set the path to where to find the shared library at | ||
runtime | ||
* add a call to configure\_library() to `configure.py` for the | ||
library at the end of list of existing configure\_library() calls. | ||
If there are additional libraries that are required it is | ||
possible to list more than one with the `pkgname` option. | ||
* in `node.gypi` guard the build for the dependency | ||
with `node_shared_depname` so that is is only built if | ||
the dependency is being bundled into Node.js itself. For example: | ||
|
||
```text | ||
[ 'node_shared_brotli=="false"', { | ||
'dependencies': [ 'deps/brotli/brotli.gyp:brotli' ], | ||
}], | ||
``` | ||
|
||
## Supporting externalizable dependencies with JavaScript codeIZA | ||
|
||
Support for an externalizable dependency with JavaScript code | ||
can be added by: | ||
|
||
* adding an entry to the `sharable_builtins` map in | ||
`configure.py`. The path should correspond to the file | ||
within the deps directory that is normally bundled into | ||
Node.js. For example `deps/cjs-module-lexer/lexer.js`. | ||
This will add a new option for building with that dependency | ||
externalized. After adding the entry you can see | ||
the new option by running `./configure --help`. | ||
|
||
* adding a call to `AddExternalizedBuiltin` to the constructor | ||
for BuildinLoader in `src/node_builtins.cc` for the | ||
dependency using the `NODE_SHARED_BUILTLIN` #define generated for | ||
the dependency. After running `./configure` with the new | ||
option you can find the #define in `config.gypi`. You can cut and | ||
paste one of the existing entries and then update to match the | ||
inport name for the dependency and the #define generated. | ||
|
||
## Supporting non-externalized dependencies with JavaScript code | ||
|
||
If the dependency consists of JavaScript in the | ||
`preferred form of the work for making modifications to it`, it | ||
can be added as a non-externalizable dependency. In this case | ||
simply add the path to the JavaScript file in the `deps_files` | ||
list in the `node.gyp` file. |
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -32,6 +32,24 @@ BuiltinLoader BuiltinLoader::instance_; | |
|
||
BuiltinLoader::BuiltinLoader() : config_(GetConfig()), has_code_cache_(false) { | ||
LoadJavaScriptSource(); | ||
#if defined(NODE_HAVE_I18N_SUPPORT) | ||
#ifdef NODE_SHARED_BUILTIN_CJS_MODULE_LEXER_LEXER_PATH | ||
AddExternalizedBuiltin( | ||
"internal/deps/cjs-module-lexer/lexer", | ||
STRINGIFY(NODE_SHARED_BUILTIN_CJS_MODULE_LEXER_LEXER_PATH)); | ||
#endif // NODE_SHARED_BUILTIN_CJS_MODULE_LEXER_LEXER_PATH | ||
|
||
#ifdef NODE_SHARED_BUILTIN_CJS_MODULE_LEXER_DIST_LEXER_PATH | ||
AddExternalizedBuiltin( | ||
"internal/deps/cjs-module-lexer/dist/lexer", | ||
STRINGIFY(NODE_SHARED_BUILTIN_CJS_MODULE_LEXER_DIST_LEXER_PATH)); | ||
#endif // NODE_SHARED_BUILTIN_CJS_MODULE_LEXER_DIST_LEXER_PATH | ||
|
||
#ifdef NODE_SHARED_BUILTIN_UNDICI_UNDICI_PATH | ||
AddExternalizedBuiltin("internal/deps/undici/undici", | ||
STRINGIFY(NODE_SHARED_BUILTIN_UNDICI_UNDICI_PATH)); | ||
#endif // NODE_SHARED_BUILTIN_UNDICI_UNDICI_PATH | ||
#endif // NODE_HAVE_I18N_SUPPORT | ||
} | ||
|
||
BuiltinLoader* BuiltinLoader::GetInstance() { | ||
|
@@ -219,6 +237,29 @@ MaybeLocal<String> BuiltinLoader::LoadBuiltinSource(Isolate* isolate, | |
#endif // NODE_BUILTIN_MODULES_PATH | ||
} | ||
|
||
#if defined(NODE_HAVE_I18N_SUPPORT) | ||
void BuiltinLoader::AddExternalizedBuiltin(const char* id, | ||
const char* filename) { | ||
std::string source; | ||
int r = ReadFileSync(&source, filename); | ||
if (r != 0) { | ||
fprintf( | ||
stderr, "Cannot load externalized builtin: \"%s:%s\".\n", id, filename); | ||
ABORT(); | ||
return; | ||
} | ||
|
||
icu::UnicodeString utf16 = icu::UnicodeString::fromUTF8( | ||
icu::StringPiece(source.data(), source.length())); | ||
auto source_utf16 = std::make_unique<icu::UnicodeString>(utf16); | ||
Comment on lines
+252
to
+254
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This probably needs some sort of There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @addaleax I was wondering about that. If we don't have intl, do you know how typically convert to utf16 if intl is not enabled ? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. #37954 (comment) has some prior discussion on that (tl;dr: we don’t, we can use V8 in cases in which an Isolate is available but not here) There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Thanks for the pointer to the discussion, I can see that is trying to the code where I'd grabbed using icu for the conversion from :) There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I ended up making it so that the --without-intl reports an error if you try to configure it along with an externally shared js builtin and then guarding the code so that if --without-intl is used without the new options it is excluded. My take is that it is acceptable for the new options to not be availble when --without-intl. My reasoning is both that people using --without-intl is a small set of people at this point in time and that those that do are unlikely to be interested in externalizing builtins and that rolling our own icu convertion for this unlikely case is not worth the work/additional code in Node.js. If that's wrong or we add native conversion helpers at some later time we can always do the work to remove that restriction. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I also did confirm as Anna mentioned that an isolate is not available in the method where we do the coversions so using V8 is not possible in this case. |
||
Add(id, | ||
UnionBytes(reinterpret_cast<const uint16_t*>((*source_utf16).getBuffer()), | ||
utf16.length())); | ||
// keep source bytes for builtin alive while BuiltinLoader exists | ||
GetInstance()->externalized_source_bytes_.push_back(std::move(source_utf16)); | ||
} | ||
#endif // NODE_HAVE_I18N_SUPPORT | ||
|
||
// Returns Local<Function> of the compiled module if return_code_cache | ||
// is false (we are only compiling the function). | ||
// Otherwise return a Local<Object> containing the cache. | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I assume this means that distros can now provide different versions of these dependencies. Are we okay with that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see this as very similar to supporting external shared libraries and it is motivated by a similar use case. Since it does not affect what we ship I think it's up to the distros that use it to keep the versions the same or compatible in what they ship just like they do for shared libraries.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we need to carefully weigh this. I see the motivation but not quite convinced yet.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jasnell can you outline your concerns in a bit more detail? We have 2 distro's asking for the functionality in order to avoid having to hack in their own solutions?