Update pkgdown documentation
georgestagg committed Sep 9, 2024
1 parent d6ef78c commit 00dfe4e
Showing 5 changed files with 97 additions and 23 deletions.
3 changes: 2 additions & 1 deletion _pkgdown.yml
@@ -1,4 +1,5 @@
url: https://r-wasm.github.io/rwasm/
template:
bootstrap: 5

deploy:
install_metadata: true
13 changes: 13 additions & 0 deletions inst/pkgdown.yml
@@ -0,0 +1,13 @@
pandoc: '3.2'
pkgdown: 2.0.9.9000
pkgdown_sha: 34ee692e4ce10c8abfb863cc782da771838558f7
articles:
github-actions: github-actions.html
mount-fs-image: mount-fs-image.html
mount-host-dir: mount-host-dir.html
rwasm: rwasm.html
tar-metadata: tar-metadata.html
last_built: 2024-09-04T08:58Z
urls:
reference: https://r-wasm.github.io/rwasm/reference
article: https://r-wasm.github.io/rwasm/articles
75 changes: 53 additions & 22 deletions vignettes/mount-fs-image.Rmd
@@ -7,47 +7,82 @@ vignette: >
%\VignetteEncoding{UTF-8}
---

## Introduction
The Emscripten WebAssembly (Wasm) environment provides a virtual filesystem (VFS) which supports the concept of *mounting*. With this, an entire file and directory structure can be packaged into a filesystem image, efficiently making individual files or entire R package libraries available for use in webR.

The Emscripten WebAssembly environment provides a virtual filesystem (VFS) which supports the concept of *mounting*. With this, an entire file and directory structure can be packaged into a filesystem image to be loaded and mounted at runtime by WebAssembly (Wasm) applications. We can take advantage of this interface to efficiently mount R package libraries, pre-packaged and containing potentially many related R packages, in the VFS accessible to webR.
## Create filesystem images

## Building an R package library
### Emscripten's `file_packager` tool

To build an R package library image we must first build one or more Wasm R packages using `add_pkg()`. As an example, let's build a package with a few hard dependencies. Ensure that you are running R in an environment with access to Wasm development tools^[See the "Setting up the WebAssembly toolchain" section in `vignette("rwasm")` for further details.], then run:
The [`file_packager`](https://emscripten.org/docs/porting/files/packaging_files.html#packaging-using-the-file-packager-tool) tool, provided by Emscripten, takes a directory structure as input and produces a webR-compatible filesystem image as output. The tool may be invoked from the [rwasm](https://r-wasm.github.io/rwasm/) package:

```{r eval=FALSE}
rwasm::add_pkg("dplyr")
rwasm::file_packager("./input", out_dir = ".", out_name = "output")
```

After the build process has completed, the new `repo` directory contains a CRAN-like package repository with R packages built for Wasm.
It can also be invoked directly using its CLI^[See the [`file_packager`](https://emscripten.org/docs/porting/files/packaging_files.html#packaging-using-the-file-packager-tool) Emscripten documentation for details.], if you prefer:

Next, run the following to build an Emscripten VFS image:
```bash
$ file_packager output.data --preload ./input@/ \
--separate-metadata --js-output=output.js
```

In the above examples, the files in the directory `./input` are packaged and an output filesystem image is created^[When using the `file_packager` CLI, a third file named `output.js` will also be created. If you only plan to mount the image using webR, this file may be discarded.] consisting of a data file, `output.data`, and a metadata file, `output.js.metadata`.

To prepare for mounting the filesystem image with webR, ensure that both files have the same basename (in this example, `output`). The resulting URLs or relative paths for the two files should differ only by the file extension.
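
The naming rule can be sketched in a few lines of base R. The helper name `image_file_pair()` is hypothetical and not part of rwasm or webR; it only illustrates the convention described above, under the assumption that the metadata file sits next to the data file and differs only in its extension.

```r
# Hypothetical helper (not part of rwasm or webR): given the path or URL of a
# filesystem image data file, derive the expected location of its metadata
# file, assuming both share the same basename.
image_file_pair <- function(data_path) {
  # Accept both plain `.data` and gzip-compressed `.data.gz` images.
  if (!grepl("\\.data(\\.gz)?$", data_path)) {
    stop("expected a path ending in '.data' or '.data.gz': ", data_path)
  }
  base <- sub("\\.data(\\.gz)?$", "", data_path)
  c(data = data_path, metadata = paste0(base, ".js.metadata"))
}

image_file_pair("https://example.com/output.data")
```

A check like this can catch a renamed or misplaced metadata file before the image is deployed.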

#### Compression

Filesystem image `.data` files may optionally be `gzip` compressed prior to deployment. The file extension for compressed filesystem images should be `.data.gz`, and compression should be indicated by setting the property `gzip: true` on the metadata JSON stored in the `.js.metadata` file.

**NOTE**: Loading compressed VFS images requires at least version 0.4.1 of webR.
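
As an illustration of the two steps described above, the following sketch compresses an existing image by hand. It assumes the jsonlite package is installed and that the `.js.metadata` file holds a single JSON object. The helper name `compress_image()` is hypothetical and not an rwasm function, so treat this as a sketch rather than the recommended workflow.

```r
library(jsonlite)

# Hypothetical helper (not part of rwasm): gzip an existing `.data` file and
# flag the compression in the accompanying `.js.metadata` file.
compress_image <- function(data_path, metadata_path) {
  # Step 1: write a gzip-compressed copy of the image data as `<name>.data.gz`.
  bytes <- readBin(data_path, what = "raw", n = file.size(data_path))
  gz_path <- paste0(data_path, ".gz")
  con <- gzfile(gz_path, open = "wb")
  writeBin(bytes, con)
  close(con)

  # Step 2: set `gzip: true` on the metadata JSON and write it back.
  # The remaining fields are carried over from the parsed file.
  metadata <- fromJSON(metadata_path, simplifyVector = FALSE)
  metadata$gzip <- TRUE
  write_json(metadata, metadata_path,
             auto_unbox = TRUE, digits = NA, null = "null")

  invisible(gz_path)
}
```

For the example image above, this could be called as `compress_image("output.data", "output.js.metadata")`, after which `output.data.gz` and the updated metadata file would be deployed together.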

### Mount `.tar` archives as a filesystem image

Archives in `.tar` format, optionally gzip compressed as `.tar.gz` or `.tgz` files, can also be used as filesystem images by pre-processing the `.tar` archive using the `rwasm::add_tar_index()` function. The function reads archive contents and appends the required filesystem metadata to the end of the `.tar` archive data in a way that is understood by webR. For further information about the format see the [Technical details for .tar archive metadata](tar-metadata.html) article.

```{r eval=FALSE}
rwasm::make_vfs_library()
rwasm::add_tar_index("./path/to/archive.tar.gz")
# Appending virtual filesystem metadata for: ./path/to/archive.tar.gz
```

By default, this function will create a new directory named `vfs` if it does not exist. The files `vfs/library.data` and `vfs/library.js.metadata` together form an Emscripten filesystem image containing an R package library consisting of all the packages previously added to the CRAN-like repository in `repo` using `add_pkg()`.
Once processed by `rwasm::add_tar_index()`, the `.tar` archive can be deployed and used directly as a filesystem image.

### Packaging arbitrary data
## Mounting filesystem images

It is also possible to package an arbitrary data directory as an Emscripten filesystem image using the `file_packager()` function:
When running in a web browser, the [`webr::mount()`](https://docs.r-wasm.org/webr/latest/api/r.qmd#mount) function downloads and mounts a filesystem image from a URL source, using the `WORKERFS` filesystem type.

```{r eval=FALSE}
rwasm::file_packager("./some/data/directory", out_name = "output_image.data")
webr::mount(
mountpoint = "/data",
source = "https://example.com/output.data"
)
```

Again, this function writes output filesystem images to the `vfs` directory by default.
Filesystem images should be deployed to static file hosting^[e.g. GitHub Pages, Netlify, AWS S3, etc.] and the resulting URL provided as the `source` argument. The image will be mounted in the virtual filesystem under the path given by the `mountpoint` argument. If the `mountpoint` directory does not exist, it will be created prior to mounting.

### Compression
When running under Node.js, the source may also be provided as a relative path to a filesystem image on disk.

The `add_pkg()`, `make_vfs_library()`, `file_packager()` and other related functions support the `compression` argument. The default value is `FALSE`, but when `TRUE` VFS images will be `gzip` compressed for deployment. For some types of package content, the savings in file size with compression can be significant.
To test filesystem images before deployment, serve them using a local static webserver. See the "Local testing" section below for an example using `httpuv::runStaticServer()` in R.

**NOTE**: Loading compressed VFS images requires at least version 0.4.1 of webR.
## Building an R package library image

## Mounting filesystem images
Multiple R packages can be bundled into a single filesystem image for mounting.

To build an R package library image we must first build one or more Wasm R packages using `add_pkg()`. As an example, let's build a package with a few hard dependencies. Ensure that you are running R in an environment with access to Wasm development tools^[See the "Setting up the WebAssembly toolchain" section in `vignette("rwasm")` for further details.], then run:

```{r eval=FALSE}
rwasm::add_pkg("dplyr")
```

After the build process has completed, the new `repo` directory contains a CRAN-like package repository with R packages built for Wasm.

The filesystem image(s) should now be hosted by a web server so that they are available at a URL. Such a URL can then be passed to `webr::mount()` to make the image available on the virtual filesystem for the Wasm R process.
Next, run the following to build an Emscripten VFS image:

```{r eval=FALSE}
rwasm::make_vfs_library()
```

By default, this function will create a new directory named `vfs` if it does not exist. The files `vfs/library.data` and `vfs/library.js.metadata` together form an Emscripten filesystem image containing an R package library consisting of all the packages previously added to the CRAN-like repository in `repo` using `add_pkg()`.
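
Before deploying, it can be useful to confirm that both halves of the image were written. The following is a small base R sketch, assuming the default `vfs` output directory and `library` image name mentioned above. `check_image()` is a hypothetical helper, not an rwasm function.

```r
# Hypothetical helper (not part of rwasm): report whether the data and
# metadata files of a filesystem image exist, and how large they are.
check_image <- function(dir = "vfs", name = "library") {
  files <- file.path(dir, paste0(name, c(".data", ".js.metadata")))
  data.frame(
    file = files,
    exists = file.exists(files),
    size_bytes = file.size(files)  # NA when a file is missing
  )
}

check_image()
```

The same helper can be pointed at other images, for example `check_image(".", "output")` for the `file_packager()` example earlier in this article.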

### Local testing

@@ -92,7 +127,3 @@ library(dplyr)
#>
#> intersect, setdiff, setequal, union
```

### Deployment

The filesystem image files should be deployed to the static file hosting service of your choice, so that they are available for download anywhere. See the "Deployment to static hosting" section in `vignette("rwasm")` for an example of how to host static files with GitHub Pages, substituting the `repo` directory for the `vfs` directory containing Emscripten filesystem images.
2 changes: 2 additions & 0 deletions vignettes/mount-host-dir.Rmd
@@ -11,6 +11,8 @@ vignette: >

When running under Node.js, the Emscripten WebAssembly environment can make available the contents of a directory on the host filesystem. In addition to providing webR access to external data files, a pre-prepared R package library can be mounted from the host filesystem. This avoids the need to download potentially large R packages or filesystem images over the network.

See the [webR documentation for more details](https://docs.r-wasm.org/webr/latest/mounting.html#mount-an-existing-host-directory) on mounting host directories under Node.js.

## Building an R package library

To build an R package library, we must first build one or more Wasm R packages using `add_pkg()`. As an example, let's build a package with a few hard dependencies. Ensure that you are running R in an environment with access to Wasm development tools^[See the "Setting up the WebAssembly toolchain" section in `vignette("rwasm")` for further details.], then run:
27 changes: 27 additions & 0 deletions vignettes/tar-metadata.Rmd
@@ -0,0 +1,27 @@
---
title: "Technical details for .tar archive metadata"
output: rmarkdown::html_document
vignette: >
%\VignetteIndexEntry{Technical details for .tar archive metadata}
%\VignetteEngine{knitr::rmarkdown}
%\VignetteEncoding{UTF-8}
---

The `rwasm::add_tar_index()` function appends Emscripten filesystem metadata to an (optionally gzip compressed) `.tar` archive.
The resulting output can be directly mounted by webR to the virtual filesystem, making the content of the archive available to the WebAssembly R process.

See the [Mounting filesystem images](mount-fs-image.html) article for more information about mounting filesystem images.

## Archive data layout

A `.tar` archive that includes Emscripten filesystem metadata has the data layout given below. The resulting `.tar` file may be gzip compressed, with file extension `.tar.gz` or `.tgz`.

| Field | Size | Description |
|-|---|-------------|
| 0 | Variable | Standard `.tar` data, including end-of-archive marker. |
| 1 | Variable | JSON metadata, UTF-8 encoded, padded with `0x00` to a 4-byte boundary. |
| 2 | 4 bytes | Magic bytes: The string `"webR"`, UTF-8 encoded (`0x77656252`). |
| 3 | 4 bytes | Reserved, currently `0x00000000`. |
| 4 | 4 bytes | Offset of JSON metadata (field 1), in units of 512-byte blocks. Signed integer, big endian. |
| 5 | 4 bytes | Length of JSON metadata, in bytes. Signed integer, big endian. |
Table: Data layout for a `.tar` archive with filesystem metadata.
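
To make the layout concrete, here is a minimal base R sketch that locates and extracts the JSON metadata from the raw bytes of an uncompressed archive. The function names are hypothetical and not part of rwasm or webR. The sketch assumes that the offset in field 4 is counted from the start of the file and that the length in field 5 excludes the `0x00` padding.

```r
# Decode a 4-byte, big-endian, signed integer from a raw vector.
be_int32 <- function(x) {
  readBin(x, what = "integer", n = 1L, size = 4L, endian = "big")
}

# Hypothetical reader (not part of rwasm or webR). `bytes` is the raw content
# of an uncompressed `.tar` archive that carries filesystem metadata.
read_tar_index <- function(bytes) {
  stopifnot(is.raw(bytes), length(bytes) >= 16L)
  n <- length(bytes)

  # Fields 2 to 5 occupy the final 16 bytes of the file.
  trailer <- bytes[(n - 15L):n]
  if (!identical(trailer[1:4], charToRaw("webR"))) {
    stop("magic bytes not found: archive has no filesystem metadata")
  }
  offset_blocks <- be_int32(trailer[9:12])   # field 4
  json_length <- be_int32(trailer[13:16])    # field 5

  # Assumption: the offset counts 512-byte blocks from the start of the file.
  # R vectors are 1-indexed, hence the `+ 1`.
  start <- offset_blocks * 512 + 1
  stopifnot(json_length > 0, start + json_length - 1 <= n - 16)
  json <- rawToChar(bytes[start:(start + json_length - 1)])

  list(offset_blocks = offset_blocks, json_length = json_length, json = json)
}

# Build a tiny synthetic example: an empty archive (two zeroed 512-byte
# blocks), placeholder JSON, padding, and the 16-byte trailer.
json <- charToRaw('{"placeholder": 1}')
padding <- raw((4 - length(json) %% 4) %% 4)
bytes <- c(
  raw(1024), json, padding,
  charToRaw("webR"), raw(4),
  writeBin(2L, raw(), size = 4L, endian = "big"),
  writeBin(length(json), raw(), size = 4L, endian = "big")
)
read_tar_index(bytes)$json
```

The JSON used here is a placeholder and does not reflect the real metadata schema. For a real archive, the bytes could be obtained with `readBin(path, "raw", n = file.size(path))`; a gzip compressed archive would need to be decompressed first, since the trailer sits at the end of the uncompressed data.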
