allennlp.commands.train#

The train subcommand can be used to train a model. It requires a configuration file and a directory in which to write the results.

$ allennlp train --help
usage: allennlp train [-h] -s SERIALIZATION_DIR [-r] [-f] [-o OVERRIDES]
                      [--file-friendly-logging] [--node-rank NODE_RANK]
                      [--dry-run] [--include-package INCLUDE_PACKAGE]
                      param_path

Train the specified model on the specified dataset.

positional arguments:
  param_path            path to parameter file describing the model to be
                        trained

optional arguments:
  -h, --help            show this help message and exit
  -s SERIALIZATION_DIR, --serialization-dir SERIALIZATION_DIR
                        directory in which to save the model and its logs
  -r, --recover         recover training from the state in serialization_dir
  -f, --force           overwrite the output directory if it exists
  -o OVERRIDES, --overrides OVERRIDES
                        a JSON structure used to override the experiment
                        configuration
  --file-friendly-logging
                        outputs tqdm status on separate lines and slows tqdm
                        refresh rate
  --node-rank NODE_RANK
                        rank of this node in the distributed setup (default =
                        0)
  --dry-run             do not train a model, but create a vocabulary, show
                        dataset statistics and other training information
  --include-package INCLUDE_PACKAGE
                        additional packages to include

train_model#

train_model(
    params: allennlp.common.params.Params,
    serialization_dir: str,
    file_friendly_logging: bool = False,
    recover: bool = False,
    force: bool = False,
    node_rank: int = 0,
    include_package: List[str] = None,
    batch_weight_key: str = '',
    dry_run: bool = False,
) -> Optional[allennlp.models.model.Model]

Trains the model specified in the given Params object, using the data and training parameters also specified in that object, and saves the results in serialization_dir.

Parameters

  • params : Params A parameter object specifying an AllenNLP Experiment.
  • serialization_dir : str The directory in which to save results and logs.
  • file_friendly_logging : bool, optional (default=False) If True, we add newlines to tqdm output, even on an interactive terminal, and we slow down tqdm's output to only once every 10 seconds.
  • recover : bool, optional (default=False) If True, we will try to recover a training run from an existing serialization directory. This is only intended for use when something actually crashed during the middle of a run. For continuing training a model on new data, see Model.from_archive.
  • force : bool, optional (default=False) If True, we will overwrite the serialization directory if it already exists.
  • node_rank : int, optional (default=0) The rank of the current node in distributed training.
  • include_package : List[str], optional In distributed mode, extra packages mentioned will be imported in trainer workers.
  • batch_weight_key : str, optional (default="") If non-empty, the name of the metric used to weight the loss on a per-batch basis.
  • dry_run : bool, optional (default=False) Do not train a model, but create a vocabulary, show dataset statistics and other training information.

Returns

best_model: Optional[Model] The model with the best epoch weights or None if in dry run.
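
As a small sketch of calling this directly (the config path and output directory are placeholders), the params are typically loaded with Params.from_file and passed straight in:

from allennlp.common.params import Params
from allennlp.commands.train import train_model

# placeholders: point these at a real experiment config and an empty output directory
params = Params.from_file("my_experiment.jsonnet")
best_model = train_model(params, serialization_dir="/tmp/my_experiment")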

train_model_from_args#

train_model_from_args(args: argparse.Namespace)

Just converts from an argparse.Namespace object to string paths.

train_model_from_file#

train_model_from_file(
    parameter_filename: str,
    serialization_dir: str,
    overrides: str = '',
    file_friendly_logging: bool = False,
    recover: bool = False,
    force: bool = False,
    node_rank: int = 0,
    include_package: List[str] = None,
    dry_run: bool = False,
) -> Optional[allennlp.models.model.Model]

A wrapper around train_model which loads the params from a file.

Parameters

  • parameter_filename : str A JSON parameter file specifying an AllenNLP experiment.
  • serialization_dir : str The directory in which to save results and logs. We just pass this along to train_model.
  • overrides : str A JSON string that we will use to override values in the input parameter file.
  • file_friendly_logging : bool, optional (default=False) If True, we make our output more friendly to saved model files. We just pass this along to train_model.
  • recover : bool, optional (default=False) If True, we will try to recover a training run from an existing serialization directory. This is only intended for use when something actually crashed during the middle of a run. For continuing training a model on new data, see Model.from_archive.
  • force : bool, optional (default=False) If True, we will overwrite the serialization directory if it already exists.
  • node_rank : int, optional (default=0) The rank of the current node in distributed training.
  • include_package : List[str], optional In distributed mode, extra packages mentioned will be imported in trainer workers.
  • dry_run : bool, optional (default=False) Do not train a model, but create a vocabulary, show dataset statistics and other training information.

Returns

best_model: Optional[Model] The model with the best epoch weights or None if in dry run.
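
A minimal sketch of the wrapper in use, again with placeholder paths; the overrides string is parsed as JSON and merged into the loaded configuration before training:

from allennlp.commands.train import train_model_from_file

best_model = train_model_from_file(
    "my_experiment.jsonnet",   # placeholder parameter file
    "/tmp/my_experiment",      # placeholder serialization directory
    overrides='{"trainer": {"num_epochs": 1}}',
)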

TrainModel#

TrainModel(
    self,
    serialization_dir: str,
    model: allennlp.models.model.Model,
    trainer: allennlp.training.trainer.Trainer,
    evaluation_data_loader: allennlp.data.dataloader.DataLoader = None,
    evaluate_on_test: bool = False,
    batch_weight_key: str = '',
) -> None

This class exists so that we can easily read a configuration file with the allennlp train command. The basic logic is that we call train_loop = TrainModel.from_params(params_from_config_file), then train_loop.run(). This class performs very little logic, pushing most of it to the Trainer that has a train() method. The point here is to construct all of the dependencies for the Trainer in a way that we can do it using from_params(), while having all of those dependencies transparently documented and not hidden in calls to params.pop(). If you are writing your own training loop, you almost certainly should not use this class, but you might look at the code for this class to see what we do, to make writing your training loop easier.

In particular, if you are tempted to call the __init__ method of this class, you are probably doing something unnecessary. Literally all we do after __init__ is call trainer.train(). You can do that yourself, if you've constructed a Trainer already. What this class gives you is a way to construct the Trainer by means of a config file. The actual constructor that we use with from_params in this class is from_partial_objects. See that method for a description of all of the allowed top-level keys in a configuration file used with allennlp train.
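
Concretely, the flow described above looks roughly like this; the config path and output directory are placeholders, and the three extra keyword arguments are the ones noted below as being obtained separately from the configuration file:

from allennlp.common.params import Params
from allennlp.commands.train import TrainModel

params = Params.from_file("my_experiment.jsonnet")
train_loop = TrainModel.from_params(
    params=params,
    serialization_dir="/tmp/my_experiment",
    local_rank=0,
    batch_weight_key="",
)
metrics = train_loop.run()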

default_implementation#

The name of the registered constructor that allennlp train uses by default when building a TrainModel from a configuration file. For TrainModel this is "default", which maps to the from_partial_objects constructor described below.

from_partial_objects#

TrainModel.from_partial_objects(
    serialization_dir: str,
    local_rank: int,
    batch_weight_key: str,
    dataset_reader: allennlp.data.dataset_readers.dataset_reader.DatasetReader,
    train_data_path: str,
    model: allennlp.common.lazy.Lazy,
    data_loader: allennlp.common.lazy.Lazy,
    trainer: allennlp.common.lazy.Lazy,
    vocabulary: allennlp.common.lazy.Lazy = None,
    datasets_for_vocab_creation: List[str] = None,
    validation_dataset_reader: allennlp.data.dataset_readers.dataset_reader.DatasetReader = None,
    validation_data_path: str = None,
    validation_data_loader: allennlp.common.lazy.Lazy = None,
    test_data_path: str = None,
    evaluate_on_test: bool = False,
) -> 'TrainModel'

This method is intended for use with our FromParams logic, to construct a TrainModel object from a config file passed to the allennlp train command. The arguments to this method are the allowed top-level keys in a configuration file (except for the first three, which are obtained separately).

You could use this outside of our FromParams logic if you really want to, but there might be easier ways to accomplish your goal than instantiating Lazy objects. If you are writing your own training loop, we recommend that you look at the implementation of this method for inspiration and possibly some utility functions you can call, but you very likely should not use this method directly.

The Lazy type annotations here are a mechanism for building dependencies to an object sequentially - the TrainModel object needs data, a model, and a trainer, but the model needs to see the data before it's constructed (to create a vocabulary) and the trainer needs the data and the model before it's constructed. Objects that have sequential dependencies like this are labeled as Lazy in their type annotations, and we pass the missing dependencies when we call their construct() method, which you can see in the code below.
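
As a small illustration of that pattern (the Counter class below is made up purely for this example), a Lazy object wraps a constructor whose remaining arguments are supplied later through construct():

from allennlp.common.lazy import Lazy

class Counter:
    def __init__(self, start: int, step: int) -> None:
        self.start = start
        self.step = step

# the step dependency is not available yet, so wrap the constructor lazily ...
lazy_counter = Lazy(lambda step: Counter(start=0, step=step))
# ... and supply it later, just as vocab/model/data are supplied to the Lazy objects above
counter = lazy_counter.construct(step=2)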

Parameters

  • serialization_dir: str The directory where logs and model archives will be saved.
  • local_rank: int The process index that is initialized using the GPU device id.
  • batch_weight_key: str The name of the metric used to weight the loss on a per-batch basis.
  • dataset_reader: DatasetReader The DatasetReader that will be used for training and (by default) for validation.
  • train_data_path: str The file (or directory) that will be passed to dataset_reader.read() to construct the training data.
  • model: Lazy[Model] The model that we will train. This is lazy because it depends on the Vocabulary; after constructing the vocabulary we call model.construct(vocab=vocabulary).
  • data_loader: Lazy[DataLoader] The data_loader we use to batch instances from the dataset reader at training and (by default) validation time. This is lazy because it takes a dataset in its constructor.
  • trainer: Lazy[Trainer] The Trainer that actually implements the training loop. This is a lazy object because it depends on the model that's going to be trained.
  • vocabulary: Lazy[Vocabulary], optional (default=None) The Vocabulary that we will use to convert strings in the data to integer ids (and possibly set sizes of embedding matrices in the Model). By default we construct the vocabulary from the instances that we read.
  • datasets_for_vocab_creation: List[str], optional (default=None) If you pass in more than one dataset but don't want to use all of them to construct a vocabulary, you can pass in this key to limit it. Valid entries in the list are "train", "validation" and "test".
  • validation_dataset_reader: DatasetReader, optional (default=None) If given, we will use this dataset reader for the validation data instead of dataset_reader.
  • validation_data_path: str, optional (default=None) If given, we will use this data for computing validation metrics and early stopping.
  • validation_data_loader: Lazy[DataLoader], optional (default=None) If given, the data_loader we use to batch instances from the dataset reader at validation and test time. This is lazy because it takes a dataset in its constructor.
  • test_data_path: str, optional (default=None) If given, we will use this as test data. This makes it available for vocab creation by default, but nothing else.
  • evaluate_on_test: bool, optional (default=False) If True, we will evaluate the final model on the test data at the end of training. Note that we do not recommend using this for actual test data in every-day experimentation; you should only very rarely evaluate your model on actual test data.
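
Putting these keys together, a configuration file for allennlp train has roughly the following shape. Every "type" name and path below is a placeholder for whichever registered components and data files you actually use, and the optional keys can simply be omitted:

{
    "dataset_reader": {"type": "my_dataset_reader"},
    "train_data_path": "path/to/train_data",
    "validation_data_path": "path/to/validation_data",
    "model": {"type": "my_model"},
    "data_loader": {"batch_size": 32},
    "trainer": {"num_epochs": 5, "optimizer": {"type": "adam"}}
}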