-
Notifications
You must be signed in to change notification settings - Fork 10.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Faster chunkedStream_getByte() #5005
Faster chunkedStream_getByte() #5005
Conversation
totally makes sense, nice find |
/botio test |
From: Bot.io (Windows)ReceivedCommand cmd_test from @Snuffleupagus received. Current queue size: 0 Live output at: http://107.22.172.223:8877/1f75c93a1085574/output.txt |
From: Bot.io (Linux)ReceivedCommand cmd_test from @Snuffleupagus received. Current queue size: 0 Live output at: http://107.21.233.14:8877/0d238b05d33328b/output.txt |
From: Bot.io (Windows)SuccessFull output at http://107.22.172.223:8877/1f75c93a1085574/output.txt Total script time: 21.06 mins
|
From: Bot.io (Linux)SuccessFull output at http://107.21.233.14:8877/0d238b05d33328b/output.txt Total script time: 23.52 mins
|
/botio-linux preview |
From: Bot.io (Linux)ReceivedCommand cmd_preview from @timvandermeij received. Current queue size: 0 Live output at: http://107.21.233.14:8877/1bd4974acd87bba/output.txt |
From: Bot.io (Linux)SuccessFull output at http://107.21.233.14:8877/1bd4974acd87bba/output.txt Total script time: 0.73 mins Published
|
Looks really good to me. |
this.ensureByte(pos); | ||
} | ||
this.pos++; | ||
return byte; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Won't this return 0 if ensureByte was needed? There should be a byte = this.bytes[pos]; after the ensureByte.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the byte at position pos
has already been loaded, then it is correct to return 0
at return byte
.
However, if this.bytes[pos]
has not been loaded yet, then this.ensureByte(pos)
will throw an exception and no value will be returned at all. I admit that this is not so obvious from the code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe we could rename ensureByte? (and then also fix the name of the anon function which is right now ensureRange)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@fkaelberer Perhaps it would be a good idea to add a comment here about ensureByte
throwing an exception if the byte isn't loaded, to avoid confusion.
@samlh ensureByte
doesn't modify this this.bytes
in any way, please see: /~https://github.com/mozilla/pdf.js/blob/master/src/core/chunked_stream.js#L94.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe we could rename ensureByte? (and then also fix the name of the anon function which is right now ensureRange)
This does not seem relevant to the current PR.
(Furthermore, I don't understand what's wrong (or unclear) with the currently used names. E.g. ensureByte
ensures that the byte we're trying to get actually exists, so that name looks fine to me.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in the projects i have seen so far, ensure had the meaning of getOrCreate, which in this context would be: if its there ok, otherwise load it.
I personally would prefer "check" or "verify" as name, but thats just bikeshedding :)
Added a text comment to the code. |
Yeah, I should have looked a bit more closely. In my opinion, the naming is fine, I just (thoughtlessly) assumed blocking instead of exception/retry. Thanks for humoring my confusion. |
I added another commit that avoids calling getByte() repeatedly in loops. |
i reviewed it and looks good to me. makes good sense and I would like to see it merged. |
/botio test |
From: Bot.io (Windows)ReceivedCommand cmd_test from @Snuffleupagus received. Current queue size: 0 Live output at: http://107.22.172.223:8877/022c09f82d50fda/output.txt |
From: Bot.io (Linux)ReceivedCommand cmd_test from @Snuffleupagus received. Current queue size: 0 Live output at: http://107.21.233.14:8877/9cf188a9e4183f9/output.txt |
From: Bot.io (Linux)SuccessFull output at http://107.21.233.14:8877/9cf188a9e4183f9/output.txt Total script time: 36.01 mins
|
Faster chunkedStream_getByte()
Thank you! |
Even though #4966 accelerated
ensureRange()
a lot, it is faster to avoid the check altogether. If the byte array is non-zero at the requested position, the data must have been loaded already, so we can skip the range check.This speeds up some PDFs considerably, for example, page 11 of the tracemonkey paper now loads 20-30% faster.