Add dataset and model validation.
Optionally validates loaded datasets and models at startup, ensuring correct use of types.

PiperOrigin-RevId: 477779691
jameswex authored and LIT team committed Sep 29, 2022
1 parent 14f82d5 commit 0fef77a
Showing 12 changed files with 1,042 additions and 22 deletions.
20 changes: 20 additions & 0 deletions documentation/api.md
@@ -70,6 +70,26 @@ and [`Model`](#models) classes implement this, and provide metadata (see the
For pre-built `demo.py` examples, check out
/~https://github.com/PAIR-code/lit/tree/main/lit_nlp/examples

### Validating Models and Data

Datasets and models can optionally be validated by LIT to ensure that dataset
examples and model output values match their respective specs.
This can be very helpful when developing new model and dataset wrappers, to
ensure correct behavior in LIT.

At LIT server startup, the `validate` runtime flag can be used to enable
validation:

*   `first`: validate the first example in each dataset for correctly typed
    values, and run it through each compatible model to check that the model
    outputs are also correctly typed.
*   `sample`: validate a sample of 5% of each dataset.
*   `all`: validate all examples in all datasets.

By default, no validation is performed, to enable quick startup.
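
For example, a server could be launched with first-example validation enabled
as follows (the `glue_demo` module path here is only illustrative; the flag
applies to any LIT server):

```sh
# Validate the first example of each dataset, and the outputs of each
# compatible model on that example, at startup.
python -m lit_nlp.examples.glue_demo --validate=first
```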

Additionally, if using LIT datasets and models outside of the LIT server,
validation can be called directly through the
[`validation`](../lit_nlp/lib/validation.py) module.
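
A rough sketch of standalone use is below. `ToyDataset` and `ToyModel` are
illustrative stand-ins for your own wrappers, and the `validate_dataset` /
`validate_model` names and the `report_all` argument are assumptions; consult
[`validation.py`](../lit_nlp/lib/validation.py) for the exact signatures.

```python
# Sketch only; function names and signatures are assumptions, see
# lit_nlp/lib/validation.py for the actual API.
from lit_nlp.api import dataset as lit_dataset
from lit_nlp.api import model as lit_model
from lit_nlp.api import types as lit_types
from lit_nlp.lib import validation


class ToyDataset(lit_dataset.Dataset):
  """Two examples, just enough to exercise the validator."""

  def __init__(self):
    self._examples = [{"sentence": "good"}, {"sentence": "bad"}]

  def spec(self) -> lit_types.Spec:
    return {"sentence": lit_types.TextSegment()}


class ToyModel(lit_model.Model):
  """Scores each sentence by its length."""

  def input_spec(self) -> lit_types.Spec:
    return {"sentence": lit_types.TextSegment()}

  def output_spec(self) -> lit_types.Spec:
    return {"score": lit_types.Scalar()}

  def predict_minibatch(self, inputs):
    return [{"score": float(len(ex["sentence"]))} for ex in inputs]


dataset = ToyDataset()
model = ToyModel()

# Check that every example matches dataset.spec().
validation.validate_dataset(dataset, report_all=True)

# Run the model over the dataset and check that its outputs match
# model.output_spec().
validation.validate_model(model, dataset, report_all=True)
```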

## Datasets

Datasets ([`Dataset`](../lit_nlp/api/dataset.py)) are
