-
Notifications
You must be signed in to change notification settings - Fork 30.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
node 20 segfaults when our loader runs #50280
Comments
Can you run it with |
not sure this is helpful, but if you'd like me to set any particular breakpoints and check values, let me know.
|
Looks like memory corruption at first sight; it's crashing in the glibc function that cleans up thread-local storage. Can you run node under valgrind and see what it reports? |
sorry this took a while (this is with the debug version of node 20.8.0). this is the terminal output:
|
should i rerun this with any of the following options? --error-limit=no |
The log file didn't provide additional clues, I'm afraid. Do you load any native modules (files with a .node suffix)? |
yes on .node files, but they have been used in production and tested with valgrind independently, so i don't suspect them. still, worth a double check. |
Can you still reproduce the crash when you exclude all .node files? |
thank you! i have isolated the crash to one file and am going to work on isolating it further. i will update this. |
more to come. |
OK, excluding all the .node files prevents the crash. If you'd like, I can close this issue and open a new issue - I'm at a loss on how to handle .node files that are being handled by an esm before node 20, --loader/--experimental-loader worked as follows:
using --loader with node 20.8.1, there are situations where we're loading a module-based user-application, e.g., test-app.mjs, that ends up requiring a executing: user-app.mjs: import { default as w } from './user-w.js';
console.log('w', w);
setTimeout(() => {
console.log('done');
}, 1000); w.js: const x = require('./x.js');
module.exports = x; x.js: module.exports = require('@contrast/agent-lib'); result:
The same error occurs when using --import ( Converting user-app to a .mjs file: user-app.mjs: import { default as w } from './w.js';
console.log('w', w);
setTimeout(() => {
console.log('done');
}, 1000); results in the same error. In all cases, it's throwing because the code (in both the exported or register()'d hook) returns if (format !== 'commonjs' && format !== 'module') {
console.debug({ load: url, format }, 'Skipping rewrite; loading original code');
return nextLoad(url, context);
} My question is how are we supposed to handle .node extensions? I've tried returning
Changing the return to
And, as a last gasp, I try creating a require in the loader: if (format !== 'commonjs' && format !== 'module') {
console.debug({ load: url, format }, 'Skipping rewrite; loading original code');
// new hack start
if (path.extname(filename) === '.node') {
const req = createRequire(import.meta.url);
const m = req(filename);
return { format, source: m, shortCircuit: true };
}
// new hack end
return nextLoad(url, context);
} which results in a segfault:
If I change to use a different .node file, it doesn't segfault, so I have something to look into there. But it does still give the ERR_UNKNOWN_MODULE_FORMAT, so I see no way to load a .node file when using either --loader or --import with node 20. There is an additional issue (should I create another issue?) with Sorry for such a dump here - I know that what we're doing is not mainstream, but we have no alternatives. |
^^^ if all that isn't clear (and I've been buried in this, so I might unintentionally omit some things), I can clean up my test programs and create a repo that contains repros for each. |
Thanks for taking a look. I'm working on making a relatively clean repo that allows for easy repros. In the meantime, maybe this helps? my task
I have not used a resolve hook (and honestly, I've had a hard time seeing how they would help me, so maybe they're just a big hole in my knowledge). Here's a quick attempt at clarifying things: Our application is the loader hook, specified after either --import or --loader. An arbitrary user application (and all that it requires) is the target for our rewriting. So If the user-application is commonjs, we specific that the command line is I do now know which .node files might be loaded by the user-application. The specific situation that I tried to document above was: user-application.mjs is loaded
that .node file appears in the load hook. |
@bnoordhuis is it proper initial setup - /~https://github.com/koshic/addon-test and |
repo: /~https://github.com/bmacnaughton/patch-o-matic
|
Which command should I use? And what is expected result? |
They each demonstrate different results. I've added them to the README. I'm sorry I didn't do that proactively. To demonstrate that the --loader hook sees the .node file even when it's required, run And you can run |
We don't accept bug reports for native modules unless it can be conclusively shown it's a bug in node itself (because, quite literally, 99% of bug reports about native modules are bugs in the module itself, not node.) |
Sadly, it's a Node issue, related to CJS loader used from ESM loader via hooks. I found easy way to reproduce it - /~https://github.com/koshic/addon-test. 'addon.node' can be any valid native module, it used only to demonstrate behavior difference. At first, we have 1.cjs file which loads 'addon.node' twice - with relative path, and with absolute path: // 1.cjs
console.log(require("./addon.node"));
console.log(require(require('path').resolve("./addon.node"))); It works via Next, let's use load hook returns source with relative path require: // hook-relative.mjs
export const load = async (url, context, nextLoad) => {
if (context.format === 'commonjs') {
return {
source: `console.log(require("./addon.node"));`,
format: "commonjs",
shortCircuit: true,
}
}
return nextLoad(url, context);
}
Last try - to use hook with absolute path in require: // hook-absolute.mjs
export const load = async (url, context, nextLoad) => {
if (context.format === 'commonjs') {
return {
source: `require(require('path').resolve("./addon.node"))`,
format: "commonjs",
shortCircuit: true,
}
}
return nextLoad(url, context);
} And we see an error: @koshic ➜ /workspaces/addon-test (main) $ node --loader=./hook-absolute.mjs 1.cjs
(node:20147) ExperimentalWarning: `--experimental-loader` may be removed in the future; instead use `register()`:
--import 'data:text/javascript,import { register } from "node:module"; import { pathToFileURL } from "node:url"; register("./hook-absolute.mjs", pathToFileURL("./"));'
(Use `node --trace-warnings ...` to show where the warning was created)
node:internal/process/esm_loader:40
internalBinding('errors').triggerUncaughtException(
^
TypeError [ERR_UNKNOWN_FILE_EXTENSION]: Unknown file extension ".node" for /workspaces/addon-test/addon.node
at new NodeError (node:internal/errors:406:5)
at Object.getFileProtocolModuleFormat [as file:] (node:internal/modules/esm/get_format:99:9)
at defaultGetFormat (node:internal/modules/esm/get_format:142:36)
at defaultLoad (node:internal/modules/esm/load:120:20)
at nextLoad (node:internal/modules/esm/hooks:833:28)
at load (file:///workspaces/addon-test/hook-absolute.mjs:10:12)
at nextLoad (node:internal/modules/esm/hooks:833:28)
at Hooks.load (node:internal/modules/esm/hooks:416:26)
at handleMessage (node:internal/modules/esm/worker:168:24)
at checkForMessages (node:internal/modules/esm/worker:117:28) {
code: 'ERR_UNKNOWN_FILE_EXTENSION'
}
Node.js v20.8.0
@koshic ➜ /workspaces/addon-test (main) $ There is not segfault, but I think it depends on the particular setup with deeper import / require chains. |
@koshic. Your observation makes sense - both addons return absolute paths (only their index.js files are referenced using relative paths - each has code to select the right binary for the architecture that returns an absolute path). Thank you for isolating the issue. @bnoordhuis - I'm sorry this got so long; it was an interactive exchange; I had some trouble making the context clear. There are two issues here:
I am not ready to say that the .node file that causes the crash is absolutely node's problem, but we've been using it since node 14, run it with valgrind, and have an extensive test suite, so it's definitely a possibility that node 20 introduced an issue. (It's also a possibility that node 20 unmasked a latent issue.) Today I'm going to work with a debug version of node to see if I can't narrow down the issue a bit. I'm not at all familiar with all the new work for loaders, so it's slow going. |
@nodejs/loaders can you confirm whether #50280 (comment) is because of the removal of |
@bnoordhuis - thank you. You're much quicker than me; I just found that it gets down to The segfault is the result of one of my attempts to make if (format !== 'commonjs' && format !== 'module') {
console.debug({ load: url, format }, 'Skipping rewrite; loading original code');
// new hack start
if (path.extname(filename) === '.node') {
const req = createRequire(import.meta.url);
const m = req(filename);
return { format, source: m, shortCircuit: true };
}
// new hack end
return nextLoad(url, context);
} The code doesn't solve the problem, but the one .node module causes a segfault while the other does not. I'm going to continue looking at it to see if I can isolate it, but I'm guessing that it's in the C++ code and my progress will be slow. I'm also guessing some code isn't expecting that format of binary data. |
I doubt it. Can the issue be reproduced with one of the With regard to the segfaulting I’m having trouble following everything in this thread, but there might be an issue here in terms of Note that in Node 21 |
@GeoffreyBooth I've already created minimal repro - /~https://github.com/koshic/addon-test. Just clone it (or use GitHub codespaces) and run I can remove .node file from the example if you want, because error is related to resolver, not loader. |
The segfault is probably related to loading the addon from the loader thread, or from both loader and main threads. |
@GeoffreyBooth Thank you for the thoughts. Yes, the thread is a mess largely because of me. This started with one issue that I couldn't reproduce, and turned into multiple other issues. globals that don't exist in another threadYes, that was what appeared to be the problem initially - it was a bit of a surprise to me but I have been able to work around that by using either the segfaultingRequiring that specific .node file works fine when using require() that is handled by
This was just an abortive attempt of mine trying to find a way to get the esm hook to handle the .node file. minimal repro@koshic's repro is mininal. As he's noted, it's happening before the module is loaded - it's the result of Again, thanks for trying to follow this very messy and unfocused thread. The nature of the problem changed as I dug into it. There is still a potential segfault with code within the context of |
A minor issue I also ran into, I attempted to return |
You need to define a |
I don't follow how that would work. The resolve hook can look for any .node file, but what should it do with them? I think I'm missing a fundamental concept on how resolve hooks are intended to be used. |
Why? That code works fine without resolve hook, via export const load = async (url, context, nextLoad) => {
if (context.format === 'commonjs') {
return {
source: `console.log(require("./addon.node"));`,
format: "commonjs",
shortCircuit: true,
}
}
return nextLoad(url, context);
} Node resolver should work consistently, regardless of 'resolve' hook presence:
What we see - require('./addon.node') works, require('/workspaces/addon-test/addon.node') doesn't. |
In ESM Node throws on unknown or unsupported extensions. That’s why in https://nodejs.org/api/module.html#transpilation there’s a This “minimal” example is somewhat convoluted, in that a loader is returning source that contains I don’t think we currently support |
@GeoffreyBooth, ok, let's simplify setup. Prerequsite:
Case 1 - 'load' hook returns CJS source cat > hook1.mjs << EOF
export const load = (url, context, nextLoad) => {
if (url.endsWith('1.cjs')) {
return {
source: 'require("./addon.node")',
format: 'commonjs',
shortCircuit: true
}
}
return nextLoad(url, context);
}
EOF
@koshic ➜ /workspaces/addon-test (main) $ node --loader ./hook1.mjs 1.cjs
(node:2392) ExperimentalWarning: `--experimental-loader` may be removed in the future; instead use `register()`:
--import 'data:text/javascript,import { register } from "node:module"; import { pathToFileURL } from "node:url"; register("./hook1.mjs", pathToFileURL("./"));'
(Use `node --trace-warnings ...` to show where the warning was created)
node:internal/modules/cjs/loader:1327
return process.dlopen(module, path.toNamespacedPath(filename));
^
Error: /workspaces/addon-test/1.node: file too short
at Module._extensions..node (node:internal/modules/cjs/loader:1327:18)
at Module.load (node:internal/modules/cjs/loader:1091:32)
at Module._load (node:internal/modules/cjs/loader:938:12)
at require (node:internal/modules/esm/translators:185:30)
at Object.<anonymous> (/workspaces/addon-test/1.cjs:1:1)
at loadCJSModule (node:internal/modules/esm/translators:208:3)
at ModuleWrap.<anonymous> (node:internal/modules/esm/translators:234:7)
at ModuleJob.run (node:internal/modules/esm/module_job:217:25)
at async ModuleLoader.import (node:internal/modules/esm/loader:316:24)
at async loadESM (node:internal/process/esm_loader:34:7) {
code: 'ERR_DLOPEN_FAILED'
}
Node.js v20.8.0
@koshic ➜ /workspaces/addon-test (main) $ Error: /workspaces/addon-test/1.node: file too short - 1.node resolved, but can't be loaded because empty. It's ok. Case 2 - 'load' hook returns CJS source with absolute path (you can insert path suitable for your machine) inside require. Doesn't work as expected. cat > hook2.mjs << EOF
export const load = (url, context, nextLoad) => {
if (url.endsWith('1.cjs')) {
return {
source: 'require("/workspaces/addon-test/1.node")',
format: 'commonjs',
shortCircuit: true
}
}
return nextLoad(url, context);
}
EOF
@koshic ➜ /workspaces/addon-test (main) $ node --loader ./hook2.mjs 1.cjs
(node:5358) ExperimentalWarning: `--experimental-loader` may be removed in the future; instead use `register()`:
--import 'data:text/javascript,import { register } from "node:module"; import { pathToFileURL } from "node:url"; register("./hook2.mjs", pathToFileURL("./"));'
(Use `node --trace-warnings ...` to show where the warning was created)
node:internal/process/esm_loader:40
internalBinding('errors').triggerUncaughtException(
^
TypeError [ERR_UNKNOWN_FILE_EXTENSION]: Unknown file extension ".node" for /workspaces/addon-test/1.node
at new NodeError (node:internal/errors:406:5)
at Object.getFileProtocolModuleFormat [as file:] (node:internal/modules/esm/get_format:99:9)
at defaultGetFormat (node:internal/modules/esm/get_format:142:36)
at defaultLoad (node:internal/modules/esm/load:120:20)
at nextLoad (node:internal/modules/esm/hooks:833:28)
at load (file:///workspaces/addon-test/hook2.mjs:10:12)
at nextLoad (node:internal/modules/esm/hooks:833:28)
at Hooks.load (node:internal/modules/esm/hooks:416:26)
at MessagePort.handleMessage (node:internal/modules/esm/worker:168:24)
at [nodejs.internal.kHybridDispatch] (node:internal/event_target:807:20) {
code: 'ERR_UNKNOWN_FILE_EXTENSION'
}
Node.js v20.8.0
@koshic ➜ /workspaces/addon-test (main) $ TypeError [ERR_UNKNOWN_FILE_EXTENSION]: Unknown file extension ".node" for /workspaces/addon-test/1.node Why do we have different results for the same file depends on path form, relative / absolute? Without loader, in plain .cjs file, both variants work without difference. |
Give
user-app.mjs: import { default as x } from './user-x.js';
setTimeout(() => {
console.log('done');
}, 1000); user-x.js: const m = process.env.SEGFAULT === 'no' ? '@contrast/distringuish' : '@contrast/agent-lib';
console.log(`requiring ${m}`);
module.exports = require(m); With the resolve() hook as it is, I try to do supply the required .node file in
If you change the name of
What am I supposed to do in the |
@koshic can you try with: cat > hook3.mjs << 'EOF'
export const load = (url, context, nextLoad) => {
if (url.endsWith('1.cjs')) {
return {
source: `require("node:module").createRequire(${JSON.stringify(url)})("./1.node")`,
format: 'commonjs',
shortCircuit: true,
}
}
return nextLoad(url, context);
}
EOF |
@aduh95 |
@aduh95 - here's one additional piece of data - your solution causes the loader to be called again with the url 'node:module' but the format I made the changes to the esm-loader-only.mjs#load() function in the patch-o-matic repo and the .node module loads correctly: if (format !== 'commonjs' && format !== 'module') {
console.debug({ load: url, format }, 'Skipping rewrite; loading original code');
if (path.extname(filename) === '.node') {
return {
source: `require("module").createRequire("${url}")("${filename}");`,
format: 'commonjs',
shortCircuit: true,
};
}
return nextLoad(url, context);
} I had to remove the
|
If anyone is interested in tracking down the segfault, /~https://github.com/bmacnaughton/patch-o-matic, reproduces it. So far, I have not been able to produce a segfault unless I load I'm happy to close this if that's what is appropriate. I can always reopen or create a new issue if and when I am able to further isolate the issue. |
None of us is going to debug someone else's binary blob so I'll be so free to close this. You should report this to them. |
@bmacnaughton were you able to find a fix or workaround? I'm having this issue as well on node 20 with esm. Same setup was working on node 19 and the segfault started after upgrading (only while debugging). |
I can help with this next week; am traveling now. |
The abbreviated answer is that the ESM
This was the solution suggested by aduh95 and koshic, both of whom were great help. |
You wrote this in a custom loader and did something like this?
Thank you! |
yes, that code is in the fwiw, |
oops - no, that's what I use with the |
Version
v20.8.0
Platform
Linux wsl 5.15.90.1-microsoft-standard-WSL2 #1 SMP Fri Jan 27 02:56:13 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
Subsystem
module loaders
What steps will reproduce the bug?
Unfortunately, I am unable to reproduce this in a simple application; I have spent a lot of time trying to do so but am obviously missing something. I'll try to explain what I see.
we execute as a loader,
node --loader our-loader user-application
.Prior to node 20, the loaders ran in the same thread as main. Our-loader initializes "data", which includes functions, in
global
. It then transforms user-application such that the "data" inglobal
is referenced. Because the loader thread does not shareglobal
with the main thread, an error is thrown when referencing properties in "data". (This is easily reproducible, but I have not been able to create a small test case that reproduces the segfault.)The "SyntaxError" that is thrown is our check to verify that the required "data" is in
global
. If I fake up "data" usingglobalPreload
orregister()
, it will get past this check but will die when "data" is really needed.How often does it reproduce? Is there a required condition?
Every time on node 20.0+
What is the expected behavior? Why is that the expected behavior?
No segfault.
What do you see instead?
Additional information
I submitted this not-very-good bug report because I've wrestled with this for a while now and have not been able to create a simple test case. The fact that it segfaults led me to submit this anyway with the hopes that someone that's been working on this can say "ok, this is the problem".
I have tried using
--import
rather than--loader
and get the same result. I'm working on changing our code to useregister()
but doesn't seem likely to address the issue ofglobal
not being available in a worker-thread.I am very interested in any suggestions on how to handle this. Maybe we need to start forcing
import
and/orrequire
into the transformed user-application in order to reproduceglobal
? We've tried to avoid that but maybe it's no longer possible.The text was updated successfully, but these errors were encountered: