# Changelog
All notable changes to this project will be documented in this file.
The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.
## Unreleased

## v1.1.0 - 2020-09-08

### Fixed
- Fixed evaluation of metrics when using distributed setting.
- Fixed a bug introduced in 1.0 where the SRL model did not reproduce the original result.
## v1.1.0rc4 - 2020-08-21

### Added
- Added regression tests for training configs that run on a scheduled workflow.
- Added a test for the pretrained sentiment analysis model.
- Added a way for questions from the Quora dataset to be concatenated, like the sequences in the SNLI dataset.
## v1.1.0rc3 - 2020-08-12

### Fixed
- Fixed `GraphParser.get_metrics` so that it expects a dict from `F1Measure.get_metric`.
- `CopyNet` and `SimpleSeq2Seq` models now work with AMP.
- Made the SST reader a little more strict in the kinds of input it accepts.
## v1.1.0rc2 - 2020-07-31

### Changed
- Updated to PyTorch 1.6.
### Fixed
- Updated the RoBERTa SST config to make proper use of the CLS token.
- Updated RoBERTa SNLI and MNLI pretrained models for latest `transformers` version.
### Added
- Added BART model.
- Added `ModelCard` and related classes. Added model cards for all the pretrained models.
- Added a field `registered_predictor_name` to `ModelCard`.
- Added a method `load_predictor` to `allennlp_models.pretrained`.
- Added support for a multi-layer decoder in the simple seq2seq model.
## v1.1.0rc1 - 2020-07-14

### Fixed
- Updated the BERT SRL model to be compatible with the new huggingface tokenizers.
- `CopyNetSeq2Seq` model now works with pretrained transformers.
- A bug with `NextTokenLM` that caused simple gradient interpreters to fail.
- A bug in `training_config` of `qanet` and `bimpm` that used the old version of `regularizer` and `initializer`.
- The fine-grained NER transformer model did not survive an upgrade of the transformers library, but it is now fixed.
- Fixed many minor formatting issues in docstrings. Docs are now published at https://docs.allennlp.org/models/.
### Changed
- `CopyNetDatasetReader` no longer automatically adds `START_TOKEN` and `END_TOKEN` to the tokenized source. If you want these in the tokenized source, it's up to the source tokenizer.
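Under the new behavior, a source tokenizer that wants start/end markers must add them itself. A minimal, library-free sketch of that idea (the marker strings and whitespace tokenization are illustrative, not the reader's actual implementation):

```python
START_TOKEN = "@start@"
END_TOKEN = "@end@"

def tokenize_with_markers(text: str) -> list[str]:
    # The reader no longer wraps the source sequence itself;
    # the tokenizer adds the boundary markers explicitly.
    return [START_TOKEN] + text.split() + [END_TOKEN]

print(tokenize_with_markers("copy this phrase"))
# → ['@start@', 'copy', 'this', 'phrase', '@end@']
```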
### Added
- Added two models for fine-grained NER.
- Added a category for multiple choice models, including a few reference implementations.
- Implemented manual distributed sharding for SNLI dataset reader.
## v1.0.0 - 2020-06-16
No additional noteworthy changes since rc6.
## v1.0.0rc6 - 2020-06-11

### Changed
- Removed deprecated `"simple_seq2seq"` predictor.
### Fixed
- Replaced `deepcopy` of `Instance`s with new `Instance.duplicate()` method.
- A bug where pretrained sentence taggers would fail to be initialized because some of the models were not imported.
- A bug in some RC models that would cause mixed precision training to crash when using NVIDIA apex.
- Predictor names were inconsistently switching between dashes and underscores. Now they all use underscores.
### Added
- Added an option to `SemanticDependenciesDatasetReader` to not skip instances that have no arcs, for validation data.
- Added default predictors to several models.
- Added sentiment analysis models to `pretrained.py`.
- Added NLI models to `pretrained.py`.
## v1.0.0rc5 - 2020-05-14

### Changed

- Moved the models into categories based on their format.
### Fixed
- Made `transformer_qa` predictor accept JSON input with the keys "question" and "passage" to be consistent with the `reading_comprehension` predictor.
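For reference, a minimal sketch of the JSON payload both predictors now accept (the question and passage text here are illustrative):

```python
import json

# The transformer_qa predictor now takes the same keys as
# the reading_comprehension predictor.
payload = {
    "question": "Who is the CEO?",
    "passage": "The company was founded in 1990. Its CEO is Jane Doe.",
}

# Serialized as one JSON object per line, the form typically
# consumed by `allennlp predict` input files.
line = json.dumps(payload)
print(line)
```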
### Added
- `conllu` dependency (previously part of `allennlp`'s dependencies).
## v1.0.0rc4 - 2020-05-14
We first introduced this CHANGELOG after release v1.0.0rc4, so please refer to the GitHub release
notes for this and earlier releases.