Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

JS - Add the basic architecture to be able to execute embedded js #12432

Merged
merged 1 commit into from
Oct 23, 2020

Conversation

calixteman
Copy link
Contributor

As a first step, we (mozilla) want to execute safely the embedded javascript for the PDF.js embedded in Firefox.
A second step will be to add a sandbox for the webextension version.
So, in order to accomplish the first step, there is a patch in review on m-c:
https://phabricator.services.mozilla.com/D91746
to add a sandbox with no privileges to eval js.
And here's the basic stuff to use it.
I implemented only few things to be able to test using the form:
https://www.guidedesimpots.lu/wp-content/uploads/2019/02/160F-2019.pdf
which has basic actions to format correctly numbers (with 2 digits).
Few explanations:

  • all the Field objects, app, console, ... live in the sandbox it's why some js code is generated in scripting_api/code_gen.js and used to init the sandbox. All the intermediary stuff which is not in the api shouldn't be accessibe.
  • all the objects available (Field, app, ...) are wrapped in a proxy (see scripting_api/proxy.js) in order to control the access to different properties: all the properties starting with a _ are private.
  • the script may add some properties starting a _, the proxy handle that in adding them in the _expandos property (see scripting_api/pdf_object.js)
  • all the generated code use class.toString method which mean that we've the true objects and we can easily test them in pdf.js
  • and we can easily test the behaviour of the objects in eval the generate code.

In order to achieve the second step, we plan to use quickjs (/~https://github.com/yurydelendik/quickjs) compiled with emscripten but of course any other good solution is welcome.

@calixteman calixteman force-pushed the scripting_api branch 4 times, most recently from 2832e17 to 0d08979 Compare October 1, 2020 14:31
@calixteman
Copy link
Contributor Author

This patch is ready for review and is probably the more import piece.
Once it's merged we'll ask for a review of the patch to create the Sandbox on firefox's side.

Copy link
Contributor

@timvandermeij timvandermeij left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had a quick first look, but it's a lot to oversee. I would really want to suggest to either cut this PR up in much smaller chunks with links to the relevant specifications, or remove a lot of the now unused parts (all the stubs, et cetera) because as-is it's quite hard to review due to the large number of properties, methods, et cetera.

Aside from that, the general approach seems OK to me.

gulpfile.js Outdated
@@ -1244,7 +1244,7 @@ function buildLib(defines, dir) {
return merge([
gulp.src(
[
"src/{core,display,shared}/*.js",
"src/{core,display,scripting_api,shared}/*.js",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure about creating another folder for this since the folder structure as-is was carefully chosen to indicate if code is exclusively for the core/display layers or is shared. It looks like the scripting code is shared? If so, maybe move the directory inside the shared directory, and maybe call it scripting instead to avoid confusion with the normal PDF.js API? I'd like to also hear other opinions before doing this.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think a separate folder is good here. Though I'm leaning towards us not including these files in the bundled pdf.js file. That's going to make pdf.js file grow and the forms scripting api will have to be parsed regardless if it is used and it will not actually be used in the regular js context. Can we bundle them into a separate file and just load it as text since we're eval'ing it anyway?

* limitations under the License.
*/

class AFormat {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where does the name AFormat come from? Is it in a specification somewhere?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was tired when I changed that...
So the "correct" name is AForm because all theses functions lived in a file (distributed with Acrobat few years ago) called AForm.js (see http://www.sfu.ca/~wcs/ForGraham/Aladdin%20stuff/Acrobat%20Reader%205.0/Contents/MacOS/JavaScripts/AForm.js).
There's no real documentation about those: only few indications here https://www.adobe.com/content/dam/acom/en/devnet/acrobat/pdfs/FormsAPIReference.pdf#page=119.
In pdfium, these functions are implemented in a file called cjs_publicmethods.cpp.
So I guess the specifications are the code itself as found in AForm.js.
I implemented them here:
/~https://github.com/calixteman/pdf.js/blob/sandbox_experimental/src/scripting_api/aformat.js
but it's for an other PR.

this._util = util;
}

AFNumber_Format(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same question here: this seems like an odd name. Why is it named like this? And are those parameters documented somewhere (since bCurrencyPrepend does not seem to be used)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See my answer above.
About the current implementation: it's just a very basic one.
My goal here with this PR is just to set up the basis and so I implemented it quickly to have something which works.

@calixteman
Copy link
Contributor Author

I fixed the first nits and remove all the stubs.
Be aware that I've a lot of other PRs to do: my current work is here /~https://github.com/calixteman/pdf.js/tree/sandbox_experimental/src/scripting_api

gulpfile.js Outdated
@@ -1244,7 +1244,7 @@ function buildLib(defines, dir) {
return merge([
gulp.src(
[
"src/{core,display,shared}/*.js",
"src/{core,display,scripting_api,shared}/*.js",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think a separate folder is good here. Though I'm leaning towards us not including these files in the bundled pdf.js file. That's going to make pdf.js file grow and the forms scripting api will have to be parsed regardless if it is used and it will not actually be used in the regular js context. Can we bundle them into a separate file and just load it as text since we're eval'ing it anyway?

Copy link
Contributor

@timvandermeij timvandermeij left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's looking more manageable than before, so thank you for doing that.

@calixteman calixteman force-pushed the scripting_api branch 3 times, most recently from 5597077 to 92a1897 Compare October 21, 2020 16:53
@brendandahl
Copy link
Contributor

/botio-linux test

@pdfjsbot
Copy link

From: Bot.io (Linux m4)


Received

Command cmd_test from @brendandahl received. Current queue size: 0

Live output at: http://54.67.70.0:8877/e61da9e6dfeb09c/output.txt

@pdfjsbot
Copy link

From: Bot.io (Linux m4)


Failed

Full output at http://54.67.70.0:8877/e61da9e6dfeb09c/output.txt

Total script time: 26.48 mins

  • Font tests: Passed
  • Unit tests: Passed
  • Regression tests: FAILED

Image differences available at: http://54.67.70.0:8877/e61da9e6dfeb09c/reftest-analyzer.html#web=eq.log

@brendandahl
Copy link
Contributor

/botio-windows test

1 similar comment
@brendandahl
Copy link
Contributor

/botio-windows test

@pdfjsbot
Copy link

From: Bot.io (Windows)


Received

Command cmd_test from @brendandahl received. Current queue size: 0

Live output at: http://54.215.176.217:8877/b1d497fd614b74f/output.txt

@brendandahl brendandahl merged commit 1eaf9c9 into mozilla:master Oct 23, 2020
@nmtigor nmtigor mentioned this pull request Sep 20, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants