This repository has been archived by the owner on Dec 16, 2022. It is now read-only.

v1.1.0rc1

Pre-release
@epwalsh epwalsh released this 14 Jul 21:00
· 173 commits to master since this release

This is the first pre-release candidate for version 1.1. There will probably be at least one more candidate before the true 1.1 release.

What's new since v1.0.0

Fixed

  • Reduced the number of log messages produced by allennlp.common.file_utils.
  • Fixed a bug where PretrainedTransformerEmbedder parameters appeared to be trainable
    in the log output even when train_parameters was set to False.
  • Fixed a bug with the sharded dataset reader where it would only read a fraction of the instances
    in distributed training.
  • Fixed checking equality of ArrayFields.
  • Fixed a bug where NamespaceSwappingField did not work correctly with .empty_field().
  • Put more sensible defaults on the huggingface_adamw optimizer.
  • Simplified logging so that all logging output always goes to one file.
  • Fixed interaction with the Python command-line debugger.
  • Log the grad norm properly even when we're not clipping it.
  • Fixed a bug where PretrainedModelInitializer failed to initialize a model with a 0-dim tensor.
  • Fixed a bug with the layer unfreezing schedule of the SlantedTriangular learning rate scheduler.
  • Fixed a regression with logging in the distributed setting. Only the main worker should write log output to the terminal.
  • Pinned the version of boto3 for package managers (e.g. poetry).
  • Fixed issue #4330 by updating the tokenizers dependency.
  • Fixed a bug in TextClassificationPredictor so that it passes tokenized inputs to the DatasetReader
    in case it does not have a tokenizer.
  • reg_loss is now only returned for models that have some regularization penalty configured.
  • Fixed a bug that prevented cached_path from downloading assets from GitHub releases.
  • Fixed a bug that erroneously increased the last label's false positive count when calculating fbeta metrics.
  • Tqdm output now looks much better when the output is being piped or redirected.
  • Small improvements to how the API documentation is rendered.
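
To illustrate why the fbeta fix above matters, here is a minimal sketch of a per-label F-beta computed from raw counts. The function name fbeta and its arguments are hypothetical and are not taken from the AllenNLP code base; the sketch only shows how an inflated false positive count lowers precision and therefore the score.

```python
def fbeta(true_positives, false_positives, false_negatives, beta=1.0):
    """Per-label F-beta from raw counts (hypothetical helper, not the AllenNLP API).

    Returns 0.0 whenever precision, recall, or the final ratio is undefined.
    """
    precision_denom = true_positives + false_positives
    recall_denom = true_positives + false_negatives
    precision = true_positives / precision_denom if precision_denom else 0.0
    recall = true_positives / recall_denom if recall_denom else 0.0

    beta_sq = beta ** 2
    denom = beta_sq * precision + recall
    if denom == 0:
        return 0.0
    return (1 + beta_sq) * precision * recall / denom


# Same true positives and false negatives, but two spurious false positives
# in the second call, as would happen if the count were erroneously increased.
correct = fbeta(true_positives=8, false_positives=2, false_negatives=2)
inflated = fbeta(true_positives=8, false_positives=4, false_negatives=2)
```

With these example counts, the inflated call yields a lower score than the correct one, which is the kind of distortion the fix removes for the last label.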

Added

  • Added a method to ModelTestCase for running basic model tests when you aren't using config files.
  • Added some convenience methods for reading files.
  • Added an option to file_utils.cached_path to automatically extract archives.
  • Added the ability to pass an archive file instead of a local directory to Vocab.from_files.
  • Added the ability to pass an archive file instead of a glob to ShardedDatasetReader.
  • Added a new "linear_with_warmup" learning rate scheduler.
  • Added a check in ShardedDatasetReader that ensures the base reader doesn't implement manual
    distributed sharding itself.
  • Added an option to PretrainedTransformerEmbedder and PretrainedTransformerMismatchedEmbedder to use a
    scalar mix of all hidden layers from the transformer model instead of just the last layer. To utilize
    this, just set last_layer_only to False.
  • cached_path() can now read files inside of archives.
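
The scalar mix option mentioned above (last_layer_only set to False) can be pictured with a small, dependency-free sketch. It assumes the usual scalar-mix formulation, in which one learned scalar per layer is softmax-normalized and the weighted sum is scaled by a factor gamma. The function scalar_mix and its plain-list inputs are hypothetical stand-ins for the tensor-based implementation in the library.

```python
import math


def scalar_mix(layers, weights, gamma=1.0):
    """Combine per-layer vectors into one vector (hypothetical sketch).

    layers:  list of equal-length lists of floats, one per hidden layer
    weights: one raw (unnormalized) scalar per layer
    gamma:   global scaling factor applied to the mixed vector
    """
    if len(layers) != len(weights):
        raise ValueError("need exactly one weight per layer")

    # Softmax over the raw weights, shifted by the max for numerical stability.
    max_w = max(weights)
    exps = [math.exp(w - max_w) for w in weights]
    total = sum(exps)
    normed = [e / total for e in exps]

    dim = len(layers[0])
    return [
        gamma * sum(normed[k] * layers[k][i] for k in range(len(layers)))
        for i in range(dim)
    ]


# With equal raw weights the mix reduces to a plain average of the layers.
mixed = scalar_mix([[1.0, 2.0], [3.0, 4.0]], weights=[0.0, 0.0])
```

In contrast, using only the last layer corresponds to returning layers[-1] unchanged.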

Changed

  • Not specifying a cuda_device now automatically determines whether to use a GPU or not.
  • Discovered plugins are logged so you can see what was loaded.
  • allennlp.data.DataLoader is now an abstract registrable class. The default implementation
    remains the same, but was renamed to allennlp.data.PyTorchDataLoader.
  • BertPooler can now unwrap and re-wrap extra dimensions if necessary.
  • New transformers dependency. Only versions >=3.0 are now supported.
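
The automatic cuda_device behavior described above can be sketched as a small decision function. This is an illustration under assumptions, not the library's code: the name resolve_cuda_device and the gpu_count parameter are hypothetical, and the sketch assumes the common convention that -1 means CPU and 0 means the first GPU. In a real setup, gpu_count would come from something like torch.cuda.device_count().

```python
def resolve_cuda_device(cuda_device=None, gpu_count=0):
    """Pick a device index (hypothetical sketch of the auto-detection logic).

    cuda_device: explicit user choice, or None when not specified
    gpu_count:   number of visible GPUs (assumed to be supplied by the caller)
    """
    # An explicit setting always wins, including an explicit -1 for CPU.
    if cuda_device is not None:
        return cuda_device
    # Nothing specified: use the first GPU if one exists, otherwise the CPU.
    return 0 if gpu_count > 0 else -1


on_gpu_machine = resolve_cuda_device(None, gpu_count=2)
on_cpu_machine = resolve_cuda_device(None, gpu_count=0)
```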

Commits

4eb9795 Prepare for release v1.1.0rc1
f195440 update 'Models' links in README (#4475)
9c801a3 add CHANGELOG to API docs, point to license on GitHub, improve API doc formatting (#4472)
69d2f03 Clean up Tqdm bars when output is being piped or redirected (#4470)
7b188c9 fixed bug that erronously increased last label's false positive count (#4473)
64db027 Skip ETag check if OSError (#4469)
b9d011e More BART changes (#4468)
7a563a8 add option to use scalar mix of all transformer layers (#4460)
d00ad66 Minor tqdm and logging clean up (#4448)
6acf205 Fix regloss logging (#4449)
8c32ddf Fixing bug in TextClassificationPredictor so that it passes tokenized inputs to the DatasetReader (#4456)
b9a9164 Update transformers requirement from <2.12,>=2.10 to >=2.10,<3.1 (#4446)
181ef5d pin boto3 to resolve some dependency issues (#4453)
c75a1eb ensure base reader of ShardedDatasetReader doesn't implement sharding itself (#4454)
8a05ad4 Update CONTRIBUTING.md (#4447)
5b988d6 ensure only rank 0 worker writes to terminal (#4445)
8482f02 fix bug with SlantedTriangular LR scheduler (#4443)
e46a578 Update transformers requirement from <2.11,>=2.10 to >=2.10,<2.12 (#4411)
8229aca Fix pretrained model initialization (#4439)
60deece Fix type hint in text_field.py (#4434)
23e549e More multiple-choice changes (#4415)
6d0a4fd generalize DataLoader (#4416)
acd9995 Automatic file-friendly logging (#4383)
637dbb1 fix README, pin mkdocs, update mkdocs-material (#4412)
9c4dfa5 small fix to pretrained transformer tokenizer (#4417)
84988b8 Log plugins discovered and filter out transformers "PyTorch version ... available" log message (#4414)
54c41fc Adds the ability to automatically detect whether we have a GPU (#4400)
96ff585 Changes from my multiple-choice work (#4368)
eee15ca Assign an empty mapping array to empty fields of NamespaceSwappingField (#4403)
aa2943e Bump mkdocs-material from 5.3.2 to 5.3.3 (#4398)
7fa7531 fix eq method of ArrayField (#4401)
e104e44 Add test to ensure data loader yields all instances when batches_per_epoch is set (#4394)
b6fd697 fix sharded dataset reader (#4396)
30e5dbf Bump mypy from 0.781 to 0.782 (#4395)
b0ba2d4 update version
1d07cc7 Bump mkdocs-material from 5.3.0 to 5.3.2 (#4389)
ffc5184 ensure Vocab.from_files and ShardedDatasetReader can handle archives (#4371)
20afe6c Add Optuna integrated badge to README.md (#4361)
ba79f14 Bump mypy from 0.780 to 0.781 (#4390)
85e531c Update README.md (#4385)
c2ecb7a Add a method to ModelTestCase for use without config files (#4381)
6852def pin some doc building requirements (#4386)
bf422d5 Add github template for using your own python run script (#4380)
ebde6e8 Bump overrides from 3.0.0 to 3.1.0 (#4375)
e52b751 ensure transformer params are frozen at initialization when train_parameters is false (#4377)
3e8a9ef Add link to new template repo for config file development (#4372)
4f70bc9 tick version for nightly releases
63a5e15 Update spacy requirement from <2.3,>=2.1.0 to >=2.1.0,<2.4 (#4370)
ef7c75b reduce amount of log messages produced by file_utils (#4366)