ignore buffer in browsers #67
Conversation
Thank you for the PR; this looks good! Out of curiosity, is the second commit (using ...)?
I'm not 100% sure, but I think (I guess) it's preferable to use a variable and declare where ...
Still wish Buffer could be removed completely... although you have already expressed in #44 that you do not want to do that, so I'm not sure I could persuade you to drop it.
The reality is that it is the other way around: using Node's ...
Recent changes in newer NodeJS by @anonrig in nodejs/node#45701 might turn this around.
On another note: it seems like they made TextDecoder faster than ...
The moment TextEncoder.encodeInto is faster (than Buffer.write), I will gladly switch!
utf8Write is basically just Buffer.write (which is documented), saving an extra call.
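For context, here is a minimal sketch of the relationship described above; it is illustrative only, not code taken from cbor-x:

```js
// Sketch: for the default 'utf8' encoding, buf.write(str, offset) ends up in
// the same native UTF-8 writer that buf.utf8Write(str, offset, length) calls
// directly, so the difference is only the per-call JS argument handling.
const buf = Buffer.alloc(64)

const a = buf.write('hello', 10)                      // documented API
const b = buf.utf8Write('hello', 10, buf.length - 10) // undocumented fast path

console.log(a, b) // both report 5 bytes written
```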
Can you share a benchmark of this comparison in the nodejs/performance repo? I'll be happy to take a look at it.
Here is my script that I have been using to test, annotated with the latest perf numbers I see on my computer with the different versions of Node. Node v18 actually more than doubled the performance of the string encoding methods (awesome!), although the improvement was across the board, so it didn't really change the relative ranking. Node v19 seems to regress slightly in performance, but not by much:
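The script itself is not reproduced here; the following is a rough sketch of that kind of comparison, with an arbitrary string size and iteration count rather than the original values:

```js
// Rough sketch of a write-method comparison like the one described above
// (not the original script); the payload and iteration count are stand-ins.
const str = 'x'.repeat(64)
const buf = Buffer.alloc(4096)
const u8 = new Uint8Array(4096)
const te = new TextEncoder()
const utf8Write = Buffer.prototype.utf8Write
const n = 1_000_000

console.time('buffer.write')
for (let i = 0; i < n; i++) buf.write(str, 10)
console.timeEnd('buffer.write')

console.time('utf8Write')
for (let i = 0; i < n; i++) utf8Write.call(buf, str, 10, buf.length - 10)
console.timeEnd('utf8Write')

console.time('encodeInto + subarray')
for (let i = 0; i < n; i++) te.encodeInto(str, u8.subarray(10))
console.timeEnd('encodeInto + subarray')
```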
@anonrig Did you want me to file an issue in nodejs/performance? I would be glad to do that if you think it would be helpful, although I hadn't really considered this a "bug", just the reality of different functions' performance characteristics.
Also, an additional data point: Bun can do the buffer.write in about 11ns, plain encodeInto in 20ns, and subarray+encodeInto in 58ns (so buffer.write is actually much faster than in Node, while subarray+encodeInto is about the same as in Node).
I'd appreciate it if you could share your findings, and I'll make it our priority to look them over at the next meeting. (And/or you're welcome to share them at the meeting with us.)
Using the browserified buffer module was way slower:
let { Buffer } = await import('https://esm.sh/buffer')
let str = document.body.outerHTML.repeat(4)
let size = new Blob([str]).size + 100 // 7116
let offset = 10
let u = new Uint8Array(size)
let b = Buffer.alloc(size)
let te = new TextEncoder()
let n = 10000
console.time('write')
for (let i = 0; i < n; i++) b.write(str, offset)
console.timeEnd('write')
console.time('encodeInto subarray')
for (let i = 0; i < n; i++) te.encodeInto(str, u.subarray(offset))
console.timeEnd('encodeInto subarray')
FYI, the browserified node:buffer doesn't have utf8Write.
Yeah, I definitely never want to use the Buffer shim; that's why cbor-x uses TextEncoder when Buffer is not available. BTW, are you also suggesting that checking for the presence of utf8Write would be better?
Nah, that was just something I noticed when I tried running your bench test inside a browser.
But if you think that checking for utf8Write existence would be good, then sure 🤷♂️
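A sketch of what such an existence check could look like; this is an assumption about the shape of the guard, not necessarily the exact change in this PR:

```js
// Sketch of the guard being discussed: only take the Buffer fast path when a
// native utf8Write is actually present, so a browserified Buffer shim (which
// lacks utf8Write) falls through to TextEncoder. Assumes `target` was
// allocated with Buffer.alloc when the fast path is available.
const textEncoder = new TextEncoder()
const hasUtf8Write =
  typeof Buffer !== 'undefined' &&
  typeof Buffer.prototype.utf8Write === 'function'

function writeString(target, str, offset) {
  if (hasUtf8Write) {
    return target.utf8Write(str, offset, target.length - offset)
  }
  return textEncoder.encodeInto(str, target.subarray(offset)).written
}
```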
Hmm, have you also tried running your benchmark with a larger payload? When I tried a payload of 14806 bytes, there were barely any noticeable differences:
const res = await fetch('https://nodejs.org/')
const ab = (await res.arrayBuffer())
const text = await new Response(ab).text()
let u = new Uint8Array(ab.byteLength + 16); // +16 gives headroom for the 10-byte write offset used below
let b = Buffer.alloc(ab.byteLength + 16);
let utf8Write = Buffer.prototype.utf8Write;
let te = new TextEncoder();
let n = 10000000
console.log(text.length)
console.time('write');
for (let i = 0; i < n; i++) b.write(text, 10)
console.timeEnd('write');
console.time('utf8Write');
for (let i = 0; i < n; i++) b.utf8Write(text, 10, 0xffffff)
console.timeEnd('utf8Write');
console.time('utf8Write.call');
for (let i = 0; i < n; i++) utf8Write.call(u, text, 10, 0xffffff)
console.timeEnd('utf8Write.call');
console.time('encodeInto subarray');
for (let i = 0; i < n; i++) te.encodeInto(text, u.subarray(10))
console.timeEnd('encodeInto subarray');
// note: this is on par with buffer.write, but it isn't directly usable since you can't specify a starting index; the subarray time needs to be included for a fair comparison
console.time('encodeInto');
for (let i = 0; i < n; i++) te.encodeInto(text, u)
console.timeEnd('encodeInto');

write: 975.146ms
Node.js v19.4.0
Right, once the actual C-level encoder takes over, they are all doing the same thing. This is all about measuring the overhead of getting the encoding going. And real-world data has a lot of small strings, so this overhead is extremely important for real-world performance.
Buffer is optional; adding this check prevents e.g. jsDelivr from automatically adding it just because the Buffer keyword is used: https://cdn.jsdelivr.net/npm/cbor-x@1.5.0/browser.js/+esm
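For illustration, one common way such auto-injection is avoided is to reach for the global through globalThis rather than the bare identifier; this is a sketch of the idea, not necessarily the PR's exact code:

```js
// Sketch: CDNs/bundlers that auto-polyfill Node globals typically trigger on
// the bare `Buffer` identifier. Reading it off globalThis keeps the lookup
// dynamic, so no shim gets injected and the check simply comes back undefined
// in browsers. (Illustrative only.)
const NativeBuffer =
  typeof globalThis !== 'undefined' ? globalThis.Buffer : undefined

const useBufferFastPath =
  NativeBuffer !== undefined &&
  typeof NativeBuffer.prototype.utf8Write === 'function'
```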