allennlp.models.reading_comprehension
Reading comprehension is loosely defined as follows: given a question and a passage of text that contains the answer, answer the question.
These submodules contain models for things that are predominantly focused on reading comprehension.
- class allennlp.models.reading_comprehension.bidaf.BidirectionalAttentionFlow(vocab: allennlp.data.vocabulary.Vocabulary, text_field_embedder: allennlp.modules.text_field_embedders.text_field_embedder.TextFieldEmbedder, num_highway_layers: int, phrase_layer: allennlp.modules.seq2seq_encoders.seq2seq_encoder.Seq2SeqEncoder, similarity_function: allennlp.modules.similarity_functions.similarity_function.SimilarityFunction, modeling_layer: allennlp.modules.seq2seq_encoders.seq2seq_encoder.Seq2SeqEncoder, span_end_encoder: allennlp.modules.seq2seq_encoders.seq2seq_encoder.Seq2SeqEncoder, dropout: float = 0.2, mask_lstms: bool = True, initializer: allennlp.nn.initializers.InitializerApplicator = <allennlp.nn.initializers.InitializerApplicator object>, regularizer: Optional[allennlp.nn.regularizers.regularizer_applicator.RegularizerApplicator] = None)[source]
Bases: allennlp.models.model.Model
This class implements Minjoon Seo’s Bidirectional Attention Flow model for answering reading comprehension questions (ICLR 2017).
The basic layout is pretty simple: encode words as a combination of word embeddings and a character-level encoder, pass the word representations through a bi-LSTM/GRU, use a matrix of attentions to put question information into the passage word representations (this is the only part that is at all non-standard), pass this through another few layers of bi-LSTMs/GRUs, and do a softmax over span start and span end.
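The attention step in the middle is the distinctive part, so here is a small self-contained sketch of it in PyTorch. This is illustrative only: the tensor sizes are toy values, and the dot-product similarity below stands in for the configurable SimilarityFunction the model actually uses.

    import torch

    # Toy encoded representations: batch of 1, passage of 5 tokens, question of 3 tokens.
    batch_size, passage_len, question_len, dim = 1, 5, 3, 4
    passage = torch.randn(batch_size, passage_len, dim)
    question = torch.randn(batch_size, question_len, dim)

    # Similarity between every passage token and every question token
    # (a plain dot product here, just for illustration).
    similarity = torch.bmm(passage, question.transpose(1, 2))  # (1, 5, 3)

    # Passage-to-question attention: for each passage token, a distribution over
    # question tokens, used to build a question-aware passage representation.
    p2q_attention = torch.softmax(similarity, dim=-1)
    question_aware_passage = torch.bmm(p2q_attention, question)  # (1, 5, 4)

    # Question-to-passage attention: which passage tokens matter most overall,
    # giving a single summary vector of the passage per batch element.
    q2p_attention = torch.softmax(similarity.max(dim=-1).values, dim=-1)  # (1, 5)
    passage_summary = torch.bmm(q2p_attention.unsqueeze(1), passage)  # (1, 1, 4)

    # The merged representation concatenates the passage encoding, the
    # question-aware passage, and elementwise products with both attended vectors.
    merged = torch.cat([passage,
                        question_aware_passage,
                        passage * question_aware_passage,
                        passage * passage_summary.expand_as(passage)], dim=-1)  # (1, 5, 16)

The merged tensor is what then flows into the modeling layer and the span predictions described below.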
- Parameters
- vocab : Vocabulary
- text_field_embedder : TextFieldEmbedder
Used to embed the question and passage TextFields we get as input to the model.
- num_highway_layers : int
The number of highway layers to use in between embedding the input and passing it through the phrase layer.
- phrase_layer : Seq2SeqEncoder
The encoder (with its own internal stacking) that we will use in between embedding tokens and doing the bidirectional attention.
- similarity_function : SimilarityFunction
The similarity function that we will use when comparing encoded passage and question representations.
- modeling_layer : Seq2SeqEncoder
The encoder (with its own internal stacking) that we will use in between the bidirectional attention and predicting span start and end.
- span_end_encoder : Seq2SeqEncoder
The encoder that we will use to incorporate span start predictions into the passage state before predicting span end.
- dropout : float, optional (default=0.2)
If greater than 0, we will apply dropout with this probability after all encoders (pytorch LSTMs do not apply dropout to their last layer).
- mask_lstms : bool, optional (default=True)
If False, we will skip passing the mask to the LSTM layers. This gives a ~2x speedup, with only a slight performance decrease, if any. We haven't experimented much with this yet, but have confirmed that we still get very similar performance with much faster training times. We still use the mask for all softmaxes, but avoid the shuffling that's required when using masking with pytorch LSTMs.
- initializer : InitializerApplicator, optional (default=InitializerApplicator())
Used to initialize the model parameters.
- regularizer : RegularizerApplicator, optional (default=None)
If provided, will be used to calculate the regularization penalty during training.
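These constructor arguments are what appear under the "model" section of a training configuration. A rough sketch written as a Python dict follows; the registered name "bidaf" refers to this class, but every size and nested option below is a placeholder chosen for illustration, not a recommended setting.

    # Illustrative "model" section of a training config, written as a Python dict.
    # All dimensions and nested encoder/embedder options are placeholders.
    bidaf_model_config = {
        "type": "bidaf",
        "text_field_embedder": {"tokens": {"type": "embedding", "embedding_dim": 100}},
        "num_highway_layers": 2,
        "phrase_layer": {"type": "lstm", "bidirectional": True, "input_size": 100, "hidden_size": 100},
        "similarity_function": {"type": "linear", "combination": "x,y,x*y", "tensor_1_dim": 200, "tensor_2_dim": 200},
        "modeling_layer": {"type": "lstm", "bidirectional": True, "input_size": 800, "hidden_size": 100, "num_layers": 2},
        "span_end_encoder": {"type": "lstm", "bidirectional": True, "input_size": 1400, "hidden_size": 100},
        "dropout": 0.2,
    }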
- forward(self, question: Dict[str, torch.LongTensor], passage: Dict[str, torch.LongTensor], span_start: torch.IntTensor = None, span_end: torch.IntTensor = None, metadata: List[Dict[str, Any]] = None) → Dict[str, torch.Tensor][source]
- Parameters
- question : Dict[str, torch.LongTensor]
From a TextField.
- passage : Dict[str, torch.LongTensor]
From a TextField. The model assumes that this passage contains the answer to the question, and predicts the beginning and ending positions of the answer within the passage.
- span_start : torch.IntTensor, optional
From an IndexField. This is one of the things we are trying to predict - the beginning position of the answer within the passage. This is an inclusive token index. If this is given, we will compute a loss that gets included in the output dictionary.
- span_end : torch.IntTensor, optional
From an IndexField. This is one of the things we are trying to predict - the ending position of the answer within the passage. This is an inclusive token index. If this is given, we will compute a loss that gets included in the output dictionary.
- metadata : List[Dict[str, Any]], optional
If present, this should contain the question tokens, passage tokens, original passage text, and token offsets into the passage for each instance in the batch. The length of this list should be the batch size, and each dictionary should have the keys question_tokens, passage_tokens, original_passage, and token_offsets.
- Returns
- An output dictionary consisting of:
- span_start_logits : torch.FloatTensor
A tensor of shape (batch_size, passage_length) representing unnormalized log probabilities of the span start position.
- span_start_probs : torch.FloatTensor
The result of softmax(span_start_logits).
- span_end_logits : torch.FloatTensor
A tensor of shape (batch_size, passage_length) representing unnormalized log probabilities of the span end position (inclusive).
- span_end_probs : torch.FloatTensor
The result of softmax(span_end_logits).
- best_span : torch.IntTensor
The result of a constrained inference over span_start_logits and span_end_logits to find the most probable span. Shape is (batch_size, 2) and each offset is a token index.
- loss : torch.FloatTensor, optional
A scalar loss to be optimised.
- best_span_str : List[str]
If sufficient metadata was provided for the instances in the batch, we also return the string from the original passage that the model thinks is the best answer to the question.
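The best_span_str value is recovered from best_span using the character offsets carried in metadata. A rough sketch of that bookkeeping, with invented values for one instance:

    # Suppose the model predicted token span (3, 5), and metadata carried the original
    # passage text plus (start_char, end_char) offsets for each token.
    original_passage = "The quick brown fox jumps over the lazy dog"
    token_offsets = [(0, 3), (4, 9), (10, 15), (16, 19), (20, 25),
                     (26, 30), (31, 34), (35, 39), (40, 43)]
    best_span = (3, 5)  # inclusive token indices

    start_char = token_offsets[best_span[0]][0]
    end_char = token_offsets[best_span[1]][1]
    best_span_str = original_passage[start_char:end_char]
    print(best_span_str)  # -> "fox jumps over"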
- static get_best_span(span_start_logits: torch.Tensor, span_end_logits: torch.Tensor) → torch.Tensor[source]
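This is the constrained decoding that produces best_span above: choose the (start, end) pair with start <= end that maximizes span_start_logits[start] + span_end_logits[end]. One way to do it in a single pass over the passage, keeping a running best start, is sketched below; this is an illustrative re-implementation, not necessarily identical to the library's version.

    import torch

    def get_best_span_sketch(span_start_logits: torch.Tensor,
                             span_end_logits: torch.Tensor) -> torch.Tensor:
        """Return a (batch_size, 2) tensor of (start, end) indices with start <= end."""
        batch_size, passage_length = span_start_logits.size()
        best_spans = span_start_logits.new_zeros((batch_size, 2), dtype=torch.long)
        for b in range(batch_size):
            best_score = float("-inf")
            best_start_so_far = 0
            for end in range(passage_length):
                # The best start for this end position is the best start seen so far.
                if span_start_logits[b, end] > span_start_logits[b, best_start_so_far]:
                    best_start_so_far = end
                score = span_start_logits[b, best_start_so_far] + span_end_logits[b, end]
                if score > best_score:
                    best_score = score
                    best_spans[b, 0] = best_start_so_far
                    best_spans[b, 1] = end
        return best_spans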
- get_metrics(self, reset: bool = False) → Dict[str, float][source]
Returns a dictionary of metrics. This method will be called by allennlp.training.Trainer in order to compute and use model metrics for early stopping and model serialization. We return an empty dictionary here rather than raising, as it is not required to implement metrics for a new model. A boolean reset parameter is passed, as frequently a metric accumulator will have some state which should be reset between epochs. Metrics should be populated during the call to forward(), with the Metric handling the accumulation of the metric until this method is called.
- class allennlp.models.reading_comprehension.bidaf_ensemble.BidafEnsemble(submodels: List[allennlp.models.reading_comprehension.bidaf.BidirectionalAttentionFlow])[source]
Bases: allennlp.models.ensemble.Ensemble
This class ensembles the output from multiple BiDAF models.
It combines results from the submodels by averaging the start and end span probabilities.
- forward(self, question: Dict[str, torch.LongTensor], passage: Dict[str, torch.LongTensor], span_start: torch.IntTensor = None, span_end: torch.IntTensor = None, metadata: List[Dict[str, Any]] = None) → Dict[str, torch.Tensor][source]
The forward method runs each of the submodels, then selects the best span from the subresults. The best span is determined by averaging the probabilities for the start and end of the spans.
- Parameters
- question : Dict[str, torch.LongTensor]
From a TextField.
- passage : Dict[str, torch.LongTensor]
From a TextField. The model assumes that this passage contains the answer to the question, and predicts the beginning and ending positions of the answer within the passage.
- span_start : torch.IntTensor, optional
From an IndexField. This is one of the things we are trying to predict - the beginning position of the answer within the passage. This is an inclusive token index. If this is given, we will compute a loss that gets included in the output dictionary.
- span_end : torch.IntTensor, optional
From an IndexField. This is one of the things we are trying to predict - the ending position of the answer within the passage. This is an inclusive token index. If this is given, we will compute a loss that gets included in the output dictionary.
- metadata : List[Dict[str, Any]], optional
If present, this should contain the question ID, original passage text, and token offsets into the passage for each instance in the batch. We use this for computing official metrics using the official SQuAD evaluation script. The length of this list should be the batch size, and each dictionary should have the keys id, original_passage, and token_offsets. If you only want the best span string and don't care about official metrics, you can omit the id key.
- Returns
- An output dictionary consisting of:
- best_span : torch.IntTensor
The result of a constrained inference over span_start_logits and span_end_logits to find the most probable span. Shape is (batch_size, 2) and each offset is a token index.
- best_span_str : List[str]
If sufficient metadata was provided for the instances in the batch, we also return the string from the original passage that the model thinks is the best answer to the question.
- classmethod from_params(vocab: allennlp.data.vocabulary.Vocabulary, params: allennlp.common.params.Params) → 'BidafEnsemble'[source]
This is the automatic implementation of from_params. Any class that subclasses FromParams (or Registrable, which itself subclasses FromParams) gets this implementation for free. If you want your class to be instantiated from params in the "obvious" way – pop off parameters and hand them to your constructor with the same names – this provides that functionality.
If you need more complex logic in your from_params method, you'll have to implement your own method that overrides this one.
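A toy illustration of that behaviour is sketched below. The ToyFeedForward class is made up purely for the example; only Params and FromParams come from the library, and this assumes the usual automatic from_params machinery applies to the annotated constructor arguments.

    from allennlp.common import Params
    from allennlp.common.from_params import FromParams

    class ToyFeedForward(FromParams):  # hypothetical class, just to show the mechanics
        def __init__(self, input_dim: int, hidden_dim: int, dropout: float = 0.0) -> None:
            self.input_dim = input_dim
            self.hidden_dim = hidden_dim
            self.dropout = dropout

    # from_params pops each key off the Params object and passes it to __init__ by name;
    # keys with defaults that are missing from the params (here "dropout") use the default.
    toy = ToyFeedForward.from_params(Params({"input_dim": 10, "hidden_dim": 5}))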
- get_metrics(self, reset: bool = False) → Dict[str, float][source]
Returns a dictionary of metrics. This method will be called by allennlp.training.Trainer in order to compute and use model metrics for early stopping and model serialization. We return an empty dictionary here rather than raising, as it is not required to implement metrics for a new model. A boolean reset parameter is passed, as frequently a metric accumulator will have some state which should be reset between epochs. Metrics should be populated during the call to forward(), with the Metric handling the accumulation of the metric until this method is called.
-
- allennlp.models.reading_comprehension.bidaf_ensemble.ensemble(subresults: List[Dict[str, torch.Tensor]]) → torch.Tensor[source]
Identifies the best prediction given the results from the submodels.
- Parameters
- subresults : List[Dict[str, torch.Tensor]]
Results of each submodel.
- Returns
- The index of the best submodel.
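The combination rule described for this ensemble (averaging the submodels' span probabilities) can be sketched as follows. This is illustrative only, written against the output keys documented for BidirectionalAttentionFlow.forward(), and is not the library's exact implementation.

    def average_span_probs(subresults):
        """Average span_start_probs / span_end_probs across submodels (illustrative)."""
        count = len(subresults)
        span_start_probs = sum(result["span_start_probs"] for result in subresults) / count
        span_end_probs = sum(result["span_end_probs"] for result in subresults) / count
        # The averaged distributions can then be passed (as log values) to get_best_span,
        # since that function accepts either logits or log probabilities.
        return span_start_probs.log(), span_end_probs.log()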
- class allennlp.models.reading_comprehension.dialog_qa.DialogQA(vocab: allennlp.data.vocabulary.Vocabulary, text_field_embedder: allennlp.modules.text_field_embedders.text_field_embedder.TextFieldEmbedder, phrase_layer: allennlp.modules.seq2seq_encoders.seq2seq_encoder.Seq2SeqEncoder, residual_encoder: allennlp.modules.seq2seq_encoders.seq2seq_encoder.Seq2SeqEncoder, span_start_encoder: allennlp.modules.seq2seq_encoders.seq2seq_encoder.Seq2SeqEncoder, span_end_encoder: allennlp.modules.seq2seq_encoders.seq2seq_encoder.Seq2SeqEncoder, initializer: allennlp.nn.initializers.InitializerApplicator, dropout: float = 0.2, num_context_answers: int = 0, marker_embedding_dim: int = 10, max_span_length: int = 30, max_turn_length: int = 12)[source]
Bases: allennlp.models.model.Model
This class implements a modified version of BiDAF (with self-attention and a residual layer, from the Clark and Gardner ACL 2017 paper), as used in the Question Answering in Context (EMNLP 2018) paper [https://arxiv.org/pdf/1808.07036.pdf].
In this set-up, a single instance is a dialog, i.e. a list of question-answer pairs.
- Parameters
- vocab : Vocabulary
- text_field_embedder : TextFieldEmbedder
Used to embed the question and passage TextFields we get as input to the model.
- phrase_layer : Seq2SeqEncoder
The encoder (with its own internal stacking) that we will use in between embedding tokens and doing the bidirectional attention.
- span_start_encoder : Seq2SeqEncoder
The encoder that we will use to incorporate span start predictions into the passage state before predicting span end.
- span_end_encoder : Seq2SeqEncoder
The encoder that we will use to incorporate span end predictions into the passage state.
- dropout : float, optional (default=0.2)
If greater than 0, we will apply dropout with this probability after all encoders (pytorch LSTMs do not apply dropout to their last layer).
- num_context_answers : int, optional (default=0)
If greater than 0, the model will consider previous question answering context.
- max_span_length : int, optional (default=30)
Maximum token length of the output span.
- max_turn_length : int, optional (default=12)
Maximum length of an interaction.
- decode(self, output_dict: Dict[str, torch.Tensor]) → Dict[str, Any][source]
Takes the result of forward() and runs inference / decoding / whatever post-processing you need to do for your model. The intent is that model.forward() should produce potentials or probabilities, and then model.decode() can take those results and run some kind of beam search or constrained inference or whatever is necessary. This does not handle all possible decoding use cases, but it at least handles simple kinds of decoding. This method modifies the input dictionary, and also returns the same dictionary.
By default in the base class we do nothing. If your model has some special decoding step, override this method.
- forward(self, question: Dict[str, torch.LongTensor], passage: Dict[str, torch.LongTensor], span_start: torch.IntTensor = None, span_end: torch.IntTensor = None, p1_answer_marker: torch.IntTensor = None, p2_answer_marker: torch.IntTensor = None, p3_answer_marker: torch.IntTensor = None, yesno_list: torch.IntTensor = None, followup_list: torch.IntTensor = None, metadata: List[Dict[str, Any]] = None) → Dict[str, torch.Tensor][source]
- Parameters
- question : Dict[str, torch.LongTensor]
From a TextField.
- passage : Dict[str, torch.LongTensor]
From a TextField. The model assumes that this passage contains the answer to the question, and predicts the beginning and ending positions of the answer within the passage.
- span_start : torch.IntTensor, optional
From an IndexField. This is one of the things we are trying to predict - the beginning position of the answer within the passage. This is an inclusive token index. If this is given, we will compute a loss that gets included in the output dictionary.
- span_end : torch.IntTensor, optional
From an IndexField. This is one of the things we are trying to predict - the ending position of the answer within the passage. This is an inclusive token index. If this is given, we will compute a loss that gets included in the output dictionary.
- p1_answer_marker : torch.IntTensor, optional
This is one of the inputs, but only when num_context_answers > 0. It is a tensor of shape [batch_size, max_qa_count, max_passage_length]. Most passage tokens are assigned 'O', except the passage tokens that belong to the previous answer in the dialog, which are assigned labels such as <1_start>, <1_in>, <1_end>. For more details, look into dataset_readers/util/make_reading_comprehension_instance_quac.
- p2_answer_marker : torch.IntTensor, optional
This is one of the inputs, but only when num_context_answers > 1. It is similar to p1_answer_marker, but marks the answer from two turns back in the passage.
- p3_answer_marker : torch.IntTensor, optional
This is one of the inputs, but only when num_context_answers > 2. It is similar to p1_answer_marker, but marks the answer from three turns back in the passage.
- yesno_list : torch.IntTensor, optional
This is one of the outputs that we are trying to predict. Three-way classification (yes / no / not a yes-no question).
- followup_list : torch.IntTensor, optional
This is one of the outputs that we are trying to predict. Three-way classification (follow up / maybe follow up / don't follow up).
- metadata : List[Dict[str, Any]], optional
If present, this should contain the question ID, original passage text, and token offsets into the passage for each instance in the batch. We use this for computing official metrics using the official SQuAD evaluation script. The length of this list should be the batch size, and each dictionary should have the keys id, original_passage, and token_offsets. If you only want the best span string and don't care about official metrics, you can omit the id key.
- Returns
- An output dictionary consisting of the following entries.
- Each entry is a nested list: the outer list iterates over dialogs, the inner list over the questions in each dialog.
- qid : List[List[str]]
A list of lists of question ids.
- followup : List[List[int]]
A list of lists of continuation marker prediction indices (y: yes, m: maybe follow up, n: don't follow up).
- yesno : List[List[int]]
A list of lists of affirmation marker prediction indices (y: yes, x: not a yes/no question, n: no).
- best_span_str : List[List[str]]
If sufficient metadata was provided for the instances in the batch, we also return the string from the original passage that the model thinks is the best answer to the question.
- loss : torch.FloatTensor, optional
A scalar loss to be optimised.
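For example, a batch containing one dialog with two questions might decode to something shaped like the following; every value here is invented purely to show the nesting.

    # Outer lists iterate over dialogs in the batch, inner lists over questions in each dialog.
    example_output = {
        "qid": [["dialog0_q#0", "dialog0_q#1"]],          # hypothetical question ids
        "followup": [[0, 2]],                              # e.g. indices meaning (yes, don't follow up)
        "yesno": [[1, 1]],                                 # e.g. indices meaning (not a yes/no question, ...)
        "best_span_str": [["in 1997", "at the national championships"]],
    }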
- get_metrics(self, reset: bool = False) → Dict[str, float][source]
Returns a dictionary of metrics. This method will be called by allennlp.training.Trainer in order to compute and use model metrics for early stopping and model serialization. We return an empty dictionary here rather than raising, as it is not required to implement metrics for a new model. A boolean reset parameter is passed, as frequently a metric accumulator will have some state which should be reset between epochs. Metrics should be populated during the call to forward(), with the Metric handling the accumulation of the metric until this method is called.
- class allennlp.models.reading_comprehension.qanet.QaNet(vocab: allennlp.data.vocabulary.Vocabulary, text_field_embedder: allennlp.modules.text_field_embedders.text_field_embedder.TextFieldEmbedder, num_highway_layers: int, phrase_layer: allennlp.modules.seq2seq_encoders.seq2seq_encoder.Seq2SeqEncoder, matrix_attention_layer: allennlp.modules.matrix_attention.matrix_attention.MatrixAttention, modeling_layer: allennlp.modules.seq2seq_encoders.seq2seq_encoder.Seq2SeqEncoder, dropout_prob: float = 0.1, initializer: allennlp.nn.initializers.InitializerApplicator = <allennlp.nn.initializers.InitializerApplicator object>, regularizer: Optional[allennlp.nn.regularizers.regularizer_applicator.RegularizerApplicator] = None)[source]
Bases: allennlp.models.model.Model
This class implements Adams Wei Yu's QANet model for machine reading comprehension, published at ICLR 2018.
The overall architecture of QANet is very similar to BiDAF. The main difference is that QANet replaces the RNN encoder with CNN + self-attention. There are also some minor differences in the modeling layer and output layer.
- Parameters
- vocab : Vocabulary
- text_field_embedder : TextFieldEmbedder
Used to embed the question and passage TextFields we get as input to the model.
- num_highway_layers : int
The number of highway layers to use in between embedding the input and passing it through the phrase layer.
- phrase_layer : Seq2SeqEncoder
The encoder (with its own internal stacking) that we will use in between embedding tokens and doing the passage-question attention.
- matrix_attention_layer : MatrixAttention
The matrix attention function that we will use when comparing encoded passage and question representations.
- modeling_layer : Seq2SeqEncoder
The encoder (with its own internal stacking) that we will use in between the bidirectional attention and predicting span start and end.
- dropout_prob : float, optional (default=0.1)
If greater than 0, we will apply dropout with this probability between layers.
- initializer : InitializerApplicator, optional (default=InitializerApplicator())
Used to initialize the model parameters.
- regularizer : RegularizerApplicator, optional (default=None)
If provided, will be used to calculate the regularization penalty during training.
- forward(self, question: Dict[str, torch.LongTensor], passage: Dict[str, torch.LongTensor], span_start: torch.IntTensor = None, span_end: torch.IntTensor = None, metadata: List[Dict[str, Any]] = None) → Dict[str, torch.Tensor][source]
- Parameters
- question : Dict[str, torch.LongTensor]
From a TextField.
- passage : Dict[str, torch.LongTensor]
From a TextField. The model assumes that this passage contains the answer to the question, and predicts the beginning and ending positions of the answer within the passage.
- span_start : torch.IntTensor, optional
From an IndexField. This is one of the things we are trying to predict - the beginning position of the answer within the passage. This is an inclusive token index. If this is given, we will compute a loss that gets included in the output dictionary.
- span_end : torch.IntTensor, optional
From an IndexField. This is one of the things we are trying to predict - the ending position of the answer within the passage. This is an inclusive token index. If this is given, we will compute a loss that gets included in the output dictionary.
- metadata : List[Dict[str, Any]], optional
If present, this should contain the question tokens, passage tokens, original passage text, and token offsets into the passage for each instance in the batch. The length of this list should be the batch size, and each dictionary should have the keys question_tokens, passage_tokens, original_passage, and token_offsets.
- Returns
- An output dictionary consisting of:
- span_start_logits : torch.FloatTensor
A tensor of shape (batch_size, passage_length) representing unnormalized log probabilities of the span start position.
- span_start_probs : torch.FloatTensor
The result of softmax(span_start_logits).
- span_end_logits : torch.FloatTensor
A tensor of shape (batch_size, passage_length) representing unnormalized log probabilities of the span end position (inclusive).
- span_end_probs : torch.FloatTensor
The result of softmax(span_end_logits).
- best_span : torch.IntTensor
The result of a constrained inference over span_start_logits and span_end_logits to find the most probable span. Shape is (batch_size, 2) and each offset is a token index.
- loss : torch.FloatTensor, optional
A scalar loss to be optimised.
- best_span_str : List[str]
If sufficient metadata was provided for the instances in the batch, we also return the string from the original passage that the model thinks is the best answer to the question.
- get_metrics(self, reset: bool = False) → Dict[str, float][source]
Returns a dictionary of metrics. This method will be called by allennlp.training.Trainer in order to compute and use model metrics for early stopping and model serialization. We return an empty dictionary here rather than raising, as it is not required to implement metrics for a new model. A boolean reset parameter is passed, as frequently a metric accumulator will have some state which should be reset between epochs. Metrics should be populated during the call to forward(), with the Metric handling the accumulation of the metric until this method is called.
- class allennlp.models.reading_comprehension.naqanet.NumericallyAugmentedQaNet(vocab: allennlp.data.vocabulary.Vocabulary, text_field_embedder: allennlp.modules.text_field_embedders.text_field_embedder.TextFieldEmbedder, num_highway_layers: int, phrase_layer: allennlp.modules.seq2seq_encoders.seq2seq_encoder.Seq2SeqEncoder, matrix_attention_layer: allennlp.modules.matrix_attention.matrix_attention.MatrixAttention, modeling_layer: allennlp.modules.seq2seq_encoders.seq2seq_encoder.Seq2SeqEncoder, dropout_prob: float = 0.1, initializer: allennlp.nn.initializers.InitializerApplicator = <allennlp.nn.initializers.InitializerApplicator object>, regularizer: Optional[allennlp.nn.regularizers.regularizer_applicator.RegularizerApplicator] = None, answering_abilities: List[str] = None)[source]
Bases: allennlp.models.model.Model
This class augments the QANet model with some rudimentary numerical reasoning abilities, as published in the original DROP paper.
The main idea here is that instead of just predicting a passage span after doing all of the QANet modeling stuff, we add several different “answer abilities”: predicting a span from the question, predicting a count, or predicting an arithmetic expression. Near the end of the QANet model, we have a variable that predicts what kind of answer type we need, and each branch has separate modeling logic to predict that answer type. We then marginalize over all possible ways of getting to the right answer through each of these answer types.
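That marginalization amounts to a log-sum-exp over per-ability log-likelihoods weighted by the predicted answer-ability distribution. A minimal sketch of the combination follows; the tensor names are illustrative, and in the real model the per-ability log-likelihoods would come from the span, count, and arithmetic heads.

    import torch

    batch_size, num_abilities = 2, 4
    # Log probability of choosing each answer ability (e.g. passage span, question span,
    # arithmetic expression, count), from the answer-ability classifier.
    answer_ability_log_probs = torch.log_softmax(torch.randn(batch_size, num_abilities), dim=-1)
    # Log likelihood of the gold answer under each ability's own head (random stand-ins here).
    log_likelihood_per_ability = torch.randn(batch_size, num_abilities)

    # Marginal log likelihood over all ways of producing the gold answer:
    # log sum_a p(ability = a) * p(answer | ability = a)
    marginal_log_likelihood = torch.logsumexp(
        answer_ability_log_probs + log_likelihood_per_ability, dim=-1)
    loss = -marginal_log_likelihood.mean()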
- forward(self, question: Dict[str, torch.LongTensor], passage: Dict[str, torch.LongTensor], number_indices: torch.LongTensor, answer_as_passage_spans: torch.LongTensor = None, answer_as_question_spans: torch.LongTensor = None, answer_as_add_sub_expressions: torch.LongTensor = None, answer_as_counts: torch.LongTensor = None, metadata: List[Dict[str, Any]] = None) → Dict[str, torch.Tensor][source]
Defines the forward pass of the model. In addition, to facilitate easy training, this method is designed to compute a loss function defined by a user.
The input is comprised of everything required to perform a training update, including labels - you define the signature here! It is down to the user to ensure that inference can be performed without the presence of these labels. Hence, any inputs not available at inference time should only be used inside a conditional block.
The intended sketch of this method is as follows:

    def forward(self, input1, input2, targets=None):
        ....
        ....
        output1 = self.layer1(input1)
        output2 = self.layer2(input2)
        output_dict = {"output1": output1, "output2": output2}
        if targets is not None:
            # Function returning a scalar torch.Tensor, defined by the user.
            loss = self._compute_loss(output1, output2, targets)
            output_dict["loss"] = loss
        return output_dict
- Parameters
- inputs :
Tensors comprising everything needed to perform a training update, including labels, which should be optional (i.e. have a default value of None). At inference time, simply pass the relevant inputs, not including the labels.
- Returns
- output_dict : Dict[str, torch.Tensor]
The outputs from the model. In order to train a model using the Trainer API, you must provide a "loss" key pointing to a scalar torch.Tensor representing the loss to be optimized.
- get_metrics(self, reset: bool = False) → Dict[str, float][source]
Returns a dictionary of metrics. This method will be called by allennlp.training.Trainer in order to compute and use model metrics for early stopping and model serialization. We return an empty dictionary here rather than raising, as it is not required to implement metrics for a new model. A boolean reset parameter is passed, as frequently a metric accumulator will have some state which should be reset between epochs. Metrics should be populated during the call to forward(), with the Metric handling the accumulation of the metric until this method is called.
-
- allennlp.models.reading_comprehension.util.get_best_span(span_start_logits: torch.Tensor, span_end_logits: torch.Tensor) → torch.Tensor[source]
This acts the same as the static method BidirectionalAttentionFlow.get_best_span() in allennlp/models/reading_comprehension/bidaf.py. We keep it here so that users can directly import this function without the class.
We call the inputs "logits" - they could either be unnormalized logits or normalized log probabilities. A log_softmax operation is a constant shifting of the entire logit vector, so taking an argmax over either one gives the same result.
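A quick, purely illustrative check of that claim:

    import torch

    logits = torch.tensor([[2.0, 0.5, 1.2, 3.1]])
    log_probs = torch.log_softmax(logits, dim=-1)

    # log_softmax subtracts the same constant (the logsumexp of the row) from every entry,
    # so the ordering - and therefore the argmax - is unchanged.
    assert torch.equal(logits.argmax(dim=-1), log_probs.argmax(dim=-1))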