Jeffrey Heer edited this page Apr 24, 2015 · 20 revisions

datalib

Datalib is a JavaScript data utility library. It provides facilities for data loading, type inference, common statistics, and string templates. While created to power Vega and related projects, datalib is a standalone library useful for data-driven JavaScript applications on both the client (web browser) and server (e.g., node.js).

For documentation, see the datalib API Reference.

Use

Datalib provides a set of utilities for working with data. These include:

  • Loading and parsing data files (e.g., JSON, TopoJSON, CSV, TSV)
  • Summary statistics (e.g., mean, stdev, median, mode, skewness, etc.)
  • Data-driven string templates, including a set of expressive filters
  • Utilities for working with JavaScript objects and arrays
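To make the summary statistics named above concrete, here is a minimal plain-JavaScript sketch of three of them (mean, median, and sample standard deviation). The helper names and the input values are hypothetical and written only for illustration; they are not datalib functions, and datalib's own implementations may differ in details such as how missing values are handled.

```javascript
// Plain-JavaScript sketch; these helpers are hypothetical, not datalib functions.

// Arithmetic mean; NaN for an empty array.
function mean(values) {
  var sum = 0;
  for (var i = 0; i < values.length; ++i) sum += values[i];
  return values.length ? sum / values.length : NaN;
}

// Median: middle value of the sorted data, or the average of the two middle values.
function median(values) {
  if (!values.length) return NaN;
  var sorted = values.slice().sort(function (a, b) { return a - b; });
  var mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
}

// Sample standard deviation (divides by n - 1); 0 for fewer than two values.
function stdev(values) {
  var n = values.length;
  if (n < 2) return 0;
  var m = mean(values), ss = 0;
  for (var i = 0; i < n; ++i) ss += (values[i] - m) * (values[i] - m);
  return Math.sqrt(ss / (n - 1));
}

var prices = [2, 4, 4, 4, 5, 5, 7, 9]; // made-up values for illustration
var stats = { mean: mean(prices), median: median(prices), stdev: stdev(prices) };
console.log(stats);
```

A library saves you from rewriting and testing helpers like these for every project, and it can apply them to each column of a data table at once.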

Datalib can be used both server-side and client-side. For use in node.js, run npm install datalib or list datalib as a dependency in your package.json file. For use on the client, datalib is bundled into a single minified JS file using browserify.

Example

// Load datalib.
var dl = require('datalib');

// Load and parse a CSV file. Datalib does type inference for you.
// The result is an array of JavaScript objects with named values.
// Parsed dates are stored as UNIX timestamp values.
var data = dl.csv('http://trifacta.github.io/vega/data/stocks.csv');

// Show summary statistics for each column of the data table.
console.log(dl.summary(data).toString());

// Compute correlation measures between price and date.
var price = dl.accessor('price');
var date = dl.accessor('date');
console.log(
  dl.cor(data, price, date), // Pearson product-moment correlation
  dl.dcor(data, price, date) // Distance correlation
);
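For readers unfamiliar with the first measure in the example, the following plain-JavaScript sketch shows what a Pearson product-moment correlation computes when values are read through accessor functions. The accessor and pearson helpers and the sample rows are hypothetical stand-ins written for illustration; this is not datalib's implementation, and it assumes every row holds a numeric value for both fields.

```javascript
// Hypothetical stand-in for an accessor: returns a function that reads one field.
function accessor(field) {
  return function (d) { return d[field]; };
}

// Pearson correlation between the values that fx and fy extract from each row.
function pearson(rows, fx, fy) {
  var n = rows.length, i;
  if (n === 0) return NaN;

  // First pass: means of both variables.
  var mx = 0, my = 0;
  for (i = 0; i < n; ++i) {
    mx += fx(rows[i]);
    my += fy(rows[i]);
  }
  mx /= n;
  my /= n;

  // Second pass: sums of products and squares of the deviations from the means.
  var sxy = 0, sxx = 0, syy = 0, dx, dy;
  for (i = 0; i < n; ++i) {
    dx = fx(rows[i]) - mx;
    dy = fy(rows[i]) - my;
    sxy += dx * dy;
    sxx += dx * dx;
    syy += dy * dy;
  }
  return sxy / Math.sqrt(sxx * syy);
}

// Made-up rows in the same shape as the parsed table: objects with named values.
var rows = [
  { date: 1, price: 10 },
  { date: 2, price: 20 },
  { date: 3, price: 30 }
];
var r = pearson(rows, accessor('price'), accessor('date'));
console.log(r); // 1, because price is an exact linear function of date here
```

Pearson correlation only captures linear association. Distance correlation, the second measure in the example, is also sensitive to nonlinear dependence, which is one reason to compute both.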