Home
NOTE: Datalib is no longer being actively maintained. The Arquero library provides similar functionality plus much more. In addition, Vega now includes its own data utilities in the vega-util and vega-statistics packages.
Datalib is a JavaScript data utility library. It provides facilities for data loading, type inference, common statistics, and string formatting. While created to power Vega and related projects, datalib is a standalone library useful for data-driven JavaScript applications on both the client (web browser) and server (e.g., node.js).
For documentation, see the datalib API Reference.
Datalib provides a set of utilities for working with data. These include:
- Loading and parsing data files (JSON, TopoJSON, CSV, TSV).
- Summary statistics (mean, deviation, median, correlation, histograms, etc.).
- Group-by aggregation queries, including streaming data support.
- Data-driven string templates with expressive formatting filters.
- Utilities for working with JavaScript functions, objects and arrays.
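To make the type inference idea concrete, here is a minimal plain-JavaScript sketch of how a column of raw strings might be classified. It is a simplified illustration, not datalib's actual implementation: the names `TESTS` and `inferType` are hypothetical, and the order of the checks is an assumption.

```javascript
// Simplified type inference over raw string values (hypothetical helpers,
// not part of datalib). Tests run from most to least specific; the first
// type that accepts every non-empty value wins, else 'string'.
var TESTS = [
  ['boolean', function(s) { return s === 'true' || s === 'false'; }],
  ['integer', function(s) { return /^[-+]?\d+$/.test(s); }],
  ['number',  function(s) { return s.trim() !== '' && !isNaN(Number(s)); }],
  // Date.parse is lenient and implementation-dependent for non-ISO input.
  ['date',    function(s) { return !isNaN(Date.parse(s)); }]
];

function inferType(values) {
  // Ignore missing values when deciding on a type.
  var present = values.filter(function(s) { return s != null && s !== ''; });
  if (present.length === 0) return 'string';
  for (var i = 0; i < TESTS.length; ++i) {
    if (present.every(TESTS[i][1])) return TESTS[i][0];
  }
  return 'string';
}

console.log(inferType(['39.81', '36.35', '']));      // a numeric column
console.log(inferType(['2000-01-01', '2000-02-01'])); // a date column
```

A real loader would also convert each value once the column type is known, for example parsing dates into timestamps.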
Datalib can be used both server-side and client-side. For use in node.js, simply `npm install datalib` or include datalib as a dependency in your package.json file. For use on the client, install via `bower install datalib` or include datalib.min.js in your page.
// Load datalib.
var dl = require('datalib');
// Load and parse a CSV file. Datalib does type inference for you.
// The result is an array of JavaScript objects with named values.
// Parsed dates are stored as UNIX timestamp values.
var data = dl.csv('http://vega.github.io/datalib/data/stocks.csv');
// Show summary statistics for each column of the data table.
console.log(dl.format.summary(data));
// Compute mean and standard deviation by ticker symbol.
var rollup = dl.groupby('symbol')
  .summarize({'price': ['mean', 'stdev']})
  .execute(data);
console.log(dl.format.table(rollup));
// Compute correlation measures between price and date.
console.log(
  dl.cor(data, 'price', 'date'),      // Pearson product-moment correlation
  dl.cor.rank(data, 'price', 'date'), // Spearman rank correlation
  dl.cor.dist(data, 'price', 'date')  // Distance correlation
);
// Compute mutual information distance between years and binned price.
var bin_price = dl.$bin(data, 'price'); // returns binned price values
var year_date = dl.$year('date'); // returns year from date field
var counts = dl.groupby(year_date, bin_price).count().execute(data);
console.log(dl.mutual.dist(counts, 'bin_price', 'year_date', 'count'));
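For readers who want to see what the group-by rollup in the example computes, the following plain-JavaScript sketch reproduces the idea without datalib. It is an illustration under stated assumptions, not the library's code: `summarizeBy` is a hypothetical helper, the sample rows are made up, the output field names are chosen for illustration, and the standard deviation is assumed to be the sample (n - 1) form.

```javascript
// Plain-JavaScript sketch of a group-by rollup (not datalib's implementation).
// Groups rows by a key field, then reports the mean and sample standard
// deviation of a numeric field for each group.
function summarizeBy(rows, keyField, valueField) {
  var groups = {};
  rows.forEach(function(row) {
    var key = row[keyField];
    (groups[key] || (groups[key] = [])).push(row[valueField]);
  });
  return Object.keys(groups).map(function(key) {
    var values = groups[key];
    var n = values.length;
    var mean = values.reduce(function(a, b) { return a + b; }, 0) / n;
    // Sum of squared deviations from the group mean.
    var ss = values.reduce(function(a, b) {
      return a + (b - mean) * (b - mean);
    }, 0);
    var out = {};
    out[keyField] = key;
    out['mean_' + valueField] = mean;
    out['stdev_' + valueField] = n > 1 ? Math.sqrt(ss / (n - 1)) : 0;
    return out;
  });
}

// Made-up sample rows, for illustration only.
var sample = [
  {symbol: 'A', price: 10},
  {symbol: 'A', price: 20},
  {symbol: 'B', price: 5},
  {symbol: 'B', price: 7},
  {symbol: 'B', price: 9}
];
var result = summarizeBy(sample, 'symbol', 'price');
console.log(result);
```

This version keeps every value of a group in memory, which is fine for small tables. A streaming aggregator would instead maintain running counts and sums so that rows can be added incrementally.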