Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Vulnerability: Memory Overflow when sending big files #5412

Closed
UnlimitedBytes opened this issue Jul 7, 2022 · 6 comments · Fixed by #5442
Closed

Vulnerability: Memory Overflow when sending big files #5412

UnlimitedBytes opened this issue Jul 7, 2022 · 6 comments · Fixed by #5442

Comments

@UnlimitedBytes
Copy link

UnlimitedBytes commented Jul 7, 2022

UPDATE

I updated the title and added some more information on the importance of this issue here: #5412 (comment)

Describe the bug

⚠ Disclaimer

First of all I'm not a native english speaker so please forgive my sometimes bad english! ❤

Second of all I'm not entirely sure if this issue belongs to sveltejs/kit or more to sveltejs/kit/adapter-node but as they both are on the same repository I'm posting this issue here anyway.

Third of all I know there may be people saying this is more of a feature request than a bug. But in my opinion it's a bug as it's blocking me from using sveltekit in this project.

🐞 The bug

Sveltekit processes any multipart/form-data that's hitting an POST endpoint completly in RAM before reaching out to the endpoint. This let's me and anyone who wants to enable file uploading run into 2 related but different problems.

  1. It's impossible to check if a file is acceptable (Filesize, Filetype, etc.) from the endpoint before the entire file is already in memory. Even by utilizing a hook before hitting the real endpoint the entire form-data gets parsed and loaded into memory before the hook is called.
  2. When transmitting large files (> 1GiB, 5GiB, 10GiB, etc.) the RAM will overflow leading to an out of memory error. This normaly can be avoided by streaming the files while parsing the form-data which leads to way lower ram utilization. Anyway because the entire form-data is directly completly read into memory by hitting the post handler enabled endpoint this can't be done in Sveltekit. As above a hook will not solve the problem as the entire form-data gets parsed and loaded into memory before the hook is called.

🌟A solution

I don't know if this is acceptable or not (maybe not because it will most likely introduce a breaking change) but a very simple solution for this problem would be to let the user decide if the body should be parsed for them. This in my opinion is the "cleanest way" of fixing this. As we already have the request .formData(), .text(), .json(), etc. functions in the Web APIs Standard.

Little side fact to this: first I thought my code was doing this behaviour because I used a .formData() which for me was the place to do such a thing but currently it's doing it regardless of the existence of it.

Reproduction

Skeleton Project

A skeleton project which generates a ~1GiB big file inside the browser and sends it to the node server can be found on: /~https://github.com/UnlimitedBytes/sveltekit-issue-overflow-formdata

Can't recommend

Anyway I do not recommend using this to test the issue a much better way (which won't kill your browser ram too) would be to generate a random file on your os and upload it via a post form!

Better way

To generate a random file with ~1GiB on different operating systems use:

# Create 1 GiB file on windows
fsutil file createnew bigfile.bin 1073741824

# Create 1 GiB file on ubuntu/debian
head -c 1073741824 </dev/urandom >bigfile.bin

# Create 1 GiB file on macos
mkfile -n 1g bigfile.bin

A simple form to upload a file should look like so:

<form action="/upload" method="POST" enctype="multipart/form-data">
  <input type="file" name="file" /><br />
  <button type="submit">Send File</button>
</form>

EDIT: This was a false assumption it is not needed to do anything with the data you can directly dump the data to the garbage collector! This only will hold the data some time longer for better visibility but the RAM usage will quickly increase anyway until the garbage collector kicks in.
In order to trigger the issue a POST endpoint handler is needed! So create an upload.js and/or an upload.svelte file with the following upload.js content:

// upload.js
const delay = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

export async function post() {
	// Give enough time to check memory
	await delay(10000);
	return {};
}

Logs

![windows task manager showing node process on around one GiB RAM usage](https://i.imgur.com/Jmts2Wz.png)

System Info

System:
  OS: Windows 10 10.0.19044
  CPU: (8) x64 AMD Ryzen™ 7 5800X3D @ 4.50GHz
  Memory: 6.05 GB / 32.00 GB
Binaries:
  Node: 18.4.0 - C:\Program Files\nodejs\node.EXE
  Yarn: 1.22.19 - ~\AppData\Roaming\npm\yarn.CMD
  npm: 8.13.1 - ~\AppData\Roaming\npm\npm.CMD
Browsers:
  Edge: Spartan (44.19041.1266.0), Chromium (103.0.1264.44)
  Internet Explorer: 11.0.19041.1566
npmPackages:
  @sveltejs/adapter-node: next => 1.0.0-next.78
  @sveltejs/kit: next => 1.0.0-next.360
  svelte: ^3.44.0 => 3.49.0
  vite: ^2.9.13 => 2.9.13

Severity

blocking all usage of SvelteKit
MAJOR - SECRUITY ISSUE

Additional Information

No response

@chvanam
Copy link

chvanam commented Jul 8, 2022

There are several open issues and discussions around this topic. I think the core team is currently working on improving this. Endpoints just recently received the ability to get a ´ReadableStream´ from a request which you can use to read files uploaded chunk by chunk without filling your memory.

But again there is not yet a "recommended" (as present in the documentation) way to do this.

@Rich-Harris
Copy link
Member

I would guess the issue is here:

return new ReadableStream({
start(controller) {
req.on('error', (error) => {
controller.error(error);
});
let size = 0;
req.on('data', (chunk) => {
size += chunk.length;
if (size > length) {
controller.error(new Error('content-length exceeded'));
}
controller.enqueue(chunk);
});
req.on('end', () => {
controller.close();
});
}
});

We're blinding enqueueing data as soon as it's available — we probably need to use pull instead of start

@Rich-Harris
Copy link
Member

(I have to add that it brings me great pleasure that someone going by the handle @UnlimitedBytes is responsible for this particular issue)

@UnlimitedBytes
Copy link
Author

Answer to #5412 (comment)

There are several open issues and discussions around this topic. I think the core team is currently working on improving this. Endpoints just recently received the ability to get a ´ReadableStream´ from a request which you can use to read files uploaded chunk by chunk without filling your memory.

But again there is not yet a "recommended" (as present in the documentation) way to do this.

Hey, first thanks for the fast response.
Could you maybe link the open issues/discussions around this topic so I can get a look into them?
The only ones I was able to find regarding to this issue or related topics to this issue are described below but are CLOSED.

  1. Handle file uploads #70 which describes that the body-parser should parse mutlipart/form-data and got closed for Breaking: use Request and Response objects in endpoints and hooks #3384 and Handling streaming request/response bodies #3419.
  2. Breaking: use Request and Response objects in endpoints and hooks #3384 which describes a breaking change that introduced Request and Response objects in endpoints and hooks, lay the foundation for streaming request/response bodies.
  3. Handling streaming request/response bodies #3419 which describes that it should be possible to stream the request body and got closed by stream request bodies #5291.
  4. stream request bodies #5291 which introduced the return of the request body as ReadableStream
  5. Ability to stream data (for non-serverless adapters like adapter-node) #1563 which describes that there should be a way to stream data from requests / to responses

All the above listed issues/pull requests are related to this issue but already closed. Also they all describe how it should be implemented or implement request body / response body streaming. All of them are good on its own but don't directly explain the issue that the body gets parsed before there is an possibility to stream it.

@UnlimitedBytes
Copy link
Author

Answer to #5412 (comment)

I would guess the issue is here:

return new ReadableStream({
start(controller) {
req.on('error', (error) => {
controller.error(error);
});
let size = 0;
req.on('data', (chunk) => {
size += chunk.length;
if (size > length) {
controller.error(new Error('content-length exceeded'));
}
controller.enqueue(chunk);
});
req.on('end', () => {
controller.close();
});
}
});

We're blinding enqueueing data as soon as it's available — we probably need to use pull instead of start

This looks like the right place as the linked code will get generated to the following inside build/handler-{hash}.js:

if (length > 0) {
  let offset = 0;
  req.on('data', (chunk) => {
    const new_len = offset + Buffer.byteLength(chunk);

    if (new_len > length) {
      return reject({
        status: 413,
        reason: 'Exceeded "Content-Length" limit'
      });
    }

    data.set(chunk, offset);
    offset = new_len;
  });
} else {
  req.on('data', (chunk) => {
    const new_data = new Uint8Array(data.length + chunk.length);
    new_data.set(data, 0);
    new_data.set(chunk, data.length);
    data = new_data;
  });
}

req.on('end', () => {
  fulfil(data);
});

which when commented out will not trigger the issue and the node process will only run on around 40MB RAM which is completly expected. When not commented out the issue triggers and the node process runs on around 1GB RAM.

Anyway the suggested fix using pull insteadof start still triggers the issue.

@UnlimitedBytes UnlimitedBytes changed the title Memory Overflow when sending big files Vulnerability: Memory Overflow when sending big files Jul 8, 2022
@UnlimitedBytes
Copy link
Author

UnlimitedBytes commented Jul 8, 2022

AT THIS POINT THIS IS A MAJOR ISSUE WHICH MAKES EVERY SVELTEKIT SITE VULNERABLE!

After some more researching on this issue I updated the title to reflect the importance of this issue. As
THIS MAKES EVERY SVELTEKIT APPLICATION VULNERABLE TO A FULL SERVER CRASH!

Due to the fact that this bug is not related to a specific adapter (only adapter-static doesn't have this problem) it's possible to overload the entire memory of a server by just sending a file that is bigger than the servers RAM. When not protected by an Reverse-Proxy (NGINX, HAPROXY, etc.) to prevent receiving mutlipart/form-data each SvelteKit Server can be used to CRASH the entire underlying physical server by overloading the node process until it eats every piece of RAM and the OS crashes.

The bug can be mitigated by services like Vercel, CloudFlare, Netlify, etc. because they spin up multiple server instances for an app. Anyway the server that's handling the specific request will fail and die tho.

For any deployment solution where no Serveless Plattform at planet scale is involved this crashes the entire underlying server when enough data gets send!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants