webpage-capture

Capture the web in many ways using headless chrome

This program is an overlay of puppeteer which is designed to allow the easy extraction of single or multiple pages or sections, in multiple formats and in the fastest way possible.

Installation as module

If you want to use it inside your scripts, save and install it in your project dependencies.

npm install --save webpage-capture

Once it has done you can import webpage-capture in your scripts and start using it, refer to the usage section.

Installation as CLI

You can also use it from the command line using the cli module, installing it globally:

npm install -g webpage-capture-cli

Than you can start playing around with webcapture command and with the options using the built-in help typing: --help

Usage

First you have to import webpage-capture and initializire a new capturer with default or custom options.

import WebCapture from 'webpage-capture'
const capturer = new WebCapture()

(async () => {

  // Single input
  await capturer.capture('https://google.it')

  // Multiple inputs
  const res = await capturer.capture([
    '/~https://github.com/b4dnewz',
    '/~https://github.com/b4dnewz/webpage-capture'
  ])
  console.log(res);

})().catch(console.log)
    .then(capturer.close())

Don't forget to close the capturer once you have done, otherwise the headless browser instance will not disconnect correctly.

Instance Options

The constructor can also take options to set default values for all the subsequent captures:

const capturer = new WebCapture({
  // default options
})

outputDir

Type: String
Default value: process.cwd()

Specify a custom directory where place the captured files.

timeout

Type: Number
Default value: 30000

Specify the default page timeout to load a page (in ms).

headers

Type: Object

Set the headers to be used during all the requests.

viewport

Type: String, Object

Set the default viewport that is used when not specified.

debug

Type: Boolean
Default value: false

Run the script in headfull mode with a delay of 1 second so you can see what is doing.

launchArgs

Type: Array
Default value: []

Custom launch arguments to be passed to Chromium instance, if you are a linux user and you are experiencing some issues, you may want to disable sandbox or setup differently.

Methods

file(input, outputFilePath, [options])

Capture a screnshot of the given input and save it to the given outputFilePath.

Returns a Promise<void> that resolves when the screenshot is written.

buffer(input, [options])

Capture a screnshot of the given input.

Returns a Promise<Buffer> with the screenshot as binary.

base64(input, [options])

Capture a screnshot of the given input.

Returns a Promise<string> with the screenshot as Base64.

capture(inputs, [options])

Capture one or multiple input using the given options. For a complete list of options see below.

Returns a Promise<void> that resolves when the screenshot is written.

Capture Options

type

Type: String
Default value: png
Possible values: png, jpeg, pdf, html, buffer, base64

Select the capture output type.

timeout

Type: Number

The timeout to use when capturing input.

headers

Type: Object

An object of headers to use when capturing input.

await capturer.capture('/~https://github.com/b4dnewz/webpage-capture', {
  headers: {
    'x-powered-by': 'webpage-capture'
  }
})

waitFor

Type: Number

Wait for the specified time before capturing the page. (in milliseconds)

waitUntil

Type: String

Wait until the selector is visible into page.

fullPage

Type: Boolean

Capture the full scrollable page, not just the visible viewport.

scripts

Type: String[]

Inject scripts into the page.

await capturer.capture('/~https://github.com/b4dnewz/webpage-capture', {
  scripts: [
    'http://example.com/some/remote/file.js',
    './local-file.js',
    'document.body.style.backgroundColor = "hotpink";'
  ]
})

styles

Type: String[]

Inject styles into the page.

await capturer.capture('/~https://github.com/b4dnewz/webpage-capture', {
  styles: [
    'http://example.com/some/remote/file.css',
    './local-file.css',
    'body { background-color: "hotpink"; }'
  ]
})

viewport

Type: String, String[]

One or more viewport to capture.

// capture single viewport
await capturer.capture('/~https://github.com/b4dnewz/webpage-capture', {
  viewport: 'nexus-5'
})

// capture multiple viewports
await capturer.capture('/~https://github.com/b4dnewz/webpage-capture', {
  viewport: ['nexus-5', 'nexus-10']
})

viewportCategory

Type: String
Default value: false

Capture all viewports that match the category name.

// capture all mobile viewports
await capturer.capture('/~https://github.com/b4dnewz/webpage-capture', {
  viewportCategory: 'mobile'
})

License

Contributing

Create an issue and describe your idea
Fork the project (/~https://github.com/b4dnewz/webpage-capture/fork)
Create your feature branch (git checkout -b my-new-feature)
Commit your changes (git commit -am 'Add some feature')
Write some tests and run it (npm run test')
Publish the branch (git push origin my-new-feature)
Create a new Pull Request

Name		Name	Last commit message	Last commit date
Latest commit History 108 Commits
__tests__		__tests__
examples		examples
lib		lib
.babelrc		.babelrc
.editorconfig		.editorconfig
.eslintignore		.eslintignore
.gitattributes		.gitattributes
.gitignore		.gitignore
.travis.yml		.travis.yml
LICENSE		LICENSE
README.md		README.md
package-lock.json		package-lock.json
package.json		package.json

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

webpage-capture

Installation as module

Installation as CLI

Usage

Instance Options

outputDir

timeout

headers

viewport

debug

launchArgs

Methods

file(input, outputFilePath, [options])

buffer(input, [options])

base64(input, [options])

capture(inputs, [options])

Capture Options

type

timeout

headers

waitFor

waitUntil

fullPage

scripts

styles

viewport

viewportCategory

License

Contributing

About

Releases

Packages

Languages

License

b4dnewz/webpage-capture

Folders and files

Latest commit

History

Repository files navigation

webpage-capture

Installation as module

Installation as CLI

Usage

Instance Options

outputDir

timeout

headers

viewport

debug

launchArgs

Methods

file(input, outputFilePath, [options])

buffer(input, [options])

base64(input, [options])

capture(inputs, [options])

Capture Options

type

timeout

headers

waitFor

waitUntil

fullPage

scripts

styles

viewport

viewportCategory

License

Contributing

About

Topics

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages