-
Notifications
You must be signed in to change notification settings - Fork 30.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
zlib: Allow zlib.gunzip features with zlib.unzip #5884
Conversation
LGTM. CI: https://ci.nodejs.org/job/node-test-pull-request/2051/ I asked you in the other PR to remove the defines but that was before I'd seen this PR. |
Sorry, even if CI passes, I’ll have to update this with a somewhat different approach and more tests. |
Okay, added a new commit with tests for the case that even the magic bytes don’t come in a single chunk (I know it’s quite an unlikely edge case, but I guess even that should be supported). The new test will require the bugfix from #5883 to pass, though. |
07ed63a
to
fd10a48
Compare
This looks like something that should obviously be supported. The only relevant bytes here are the first two bytes of the file, right? When would these be in separate chunks? |
Yes, these are the only inspected bytes. And honestly, I have no idea how they might get split into separate chunks. 😄 But it could, because we don’t have control about where the data comes from, and I think that means either not supporting this feature at all or creating an edge case in which it does not work. |
I guess if someone set the chunksize to 1 when constructing a |
> zlib.createGunzip({chunkSize:1})
Error: Invalid chunk size: 1 I guess you’re talking about something like receiving |
Actually, your example is exactly what I was thinking of, but I'm happy to see that it isn't possible. But obviously one can construct pathological cases where this would happen, like passing one byte at a time to Anyway, I give this a big 👍, thanks for improving on this 😃 |
I think that should be covered by the test cases here, but I guess you can feel free to create more of them. And again, good to have you on board with this! 😄 |
@@ -236,6 +239,40 @@ class ZCtx : public AsyncWrap { | |||
ctx->err_ = deflate(&ctx->strm_, ctx->flush_); | |||
break; | |||
case UNZIP: | |||
next_expected_header_byte = ctx->strm_.next_in; | |||
|
|||
switch (ctx->gzip_id_bytes_read_) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Quick question: would it be necessary to check that ctx->strm_.avail_in > 0
before trying to read the first byte?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You’re right.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In that vein, shouldn't there be a similar check for GZIP_HEADER_ID2?
Another question: what happens when the state machines sees GZIP_HEADER_ID1 followed by a byte that's not GZIP_HEADER_ID2? Is the first byte lost?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, you’re right – updated again. And no, no bytes are lost in that case; there is always the fall-through in the outer switch from case UNZIP:
to case INFLATE:
, so all input ends up being passed to inflate()
.
fd10a48
to
2598a0f
Compare
Rebased and updated with your suggestion; The tests are passing now that #5883 is merged. |
2598a0f
to
a388a5e
Compare
LGTM if @bnoordhuis is happy. |
LGTM with the style nit that comments should be capitalized and punctuated. |
Detect whether a gzip file is being passed to `unzip*` by testing the first bytes for the gzip magic bytes, and setting the decompression mode to `GUNZIP` or `INFLATE` according to the result. This enables gzip-only features like multi-member support to be used together with the `unzip*` autodetection support and thereby makes `gunzip*` and `unzip*` return identical results for gzip input again. Add a simple test for checking that features specific to `zlib.gunzip`, notably support for multiple members, also work when using `zlib.unzip`.
a388a5e
to
a8e1fb5
Compare
Updated the comments. :) |
Thanks Anna, landed in 2d7e316. |
Detect whether a gzip file is being passed to `unzip*` by testing the first bytes for the gzip magic bytes, and setting the decompression mode to `GUNZIP` or `INFLATE` according to the result. This enables gzip-only features like multi-member support to be used together with the `unzip*` autodetection support and thereby makes `gunzip*` and `unzip*` return identical results for gzip input again. Add a simple test for checking that features specific to `zlib.gunzip`, notably support for multiple members, also work when using `zlib.unzip`. PR-URL: #5884 Reviewed-By: Ben Noordhuis <info@bnoordhuis.nl> Reviewed-By: James M Snell <jasnell@gmail.com>
Detect whether a gzip file is being passed to `unzip*` by testing the first bytes for the gzip magic bytes, and setting the decompression mode to `GUNZIP` or `INFLATE` according to the result. This enables gzip-only features like multi-member support to be used together with the `unzip*` autodetection support and thereby makes `gunzip*` and `unzip*` return identical results for gzip input again. Add a simple test for checking that features specific to `zlib.gunzip`, notably support for multiple members, also work when using `zlib.unzip`. PR-URL: #5884 Reviewed-By: Ben Noordhuis <info@bnoordhuis.nl> Reviewed-By: James M Snell <jasnell@gmail.com>
Detect whether a gzip file is being passed to `unzip*` by testing the first bytes for the gzip magic bytes, and setting the decompression mode to `GUNZIP` or `INFLATE` according to the result. This enables gzip-only features like multi-member support to be used together with the `unzip*` autodetection support and thereby makes `gunzip*` and `unzip*` return identical results for gzip input again. Add a simple test for checking that features specific to `zlib.gunzip`, notably support for multiple members, also work when using `zlib.unzip`. PR-URL: #5884 Reviewed-By: Ben Noordhuis <info@bnoordhuis.nl> Reviewed-By: James M Snell <jasnell@gmail.com>
Pull Request check-list
make -j8 test
(UNIX) orvcbuild test nosign
(Windows) pass withthis change (including linting)?
test (or a benchmark) included?
Affected core subsystem(s)
zlib
Description of change
Detect whether a gzip file is being passed to
unzip*
by testing the first bytes for the gzip magic bytes, and setting the decompression mode toGUNZIP
orINFLATE
according to the result.This enables gzip-only features like multi-member support to be used together with the
unzip*
autodetection support and thereby makesgunzip*
andunzip*
return identical results for gzip input again.This is more or less a follow-up change to #5120.