allennlp_models.rc.models.qanet

QaNet#

@Model.register("qanet")
@Model.register("rc-qanet")
class QaNet(Model):
 | def __init__(
 |     self,
 |     vocab: Vocabulary,
 |     text_field_embedder: TextFieldEmbedder,
 |     num_highway_layers: int,
 |     phrase_layer: Seq2SeqEncoder,
 |     matrix_attention_layer: MatrixAttention,
 |     modeling_layer: Seq2SeqEncoder,
 |     dropout_prob: float = 0.1,
 |     initializer: InitializerApplicator = InitializerApplicator(),
 |     regularizer: Optional[RegularizerApplicator] = None
 | ) -> None

This class implements the QANet model of Adams Wei Yu et al. (https://openreview.net/forum?id=B14TlG-RW), published at ICLR 2018, for machine reading comprehension.

The overall architecture of QANet is very similar to BiDAF. The main difference is that QANet replaces the RNN encoder with CNN + self-attention. There are also some minor differences in the modeling layer and output layer.

Parameters

vocab : Vocabulary

text_field_embedder : TextFieldEmbedder
    Used to embed the question and passage TextFields we get as input to the model.

num_highway_layers : int
    The number of highway layers to use in between embedding the input and passing it through the phrase layer.

phrase_layer : Seq2SeqEncoder
    The encoder (with its own internal stacking) that we will use in between embedding tokens and doing the passage-question attention.

matrix_attention_layer : MatrixAttention
    The matrix attention function that we will use when comparing encoded passage and question representations.

modeling_layer : Seq2SeqEncoder
    The encoder (with its own internal stacking) that we will use in between the bidirectional attention and predicting span start and end.

dropout_prob : float, optional (default=0.1)
    If greater than 0, we will apply dropout with this probability between layers.

initializer : InitializerApplicator, optional (default=InitializerApplicator())
    Used to initialize the model parameters.

regularizer : RegularizerApplicator, optional (default=None)
    If provided, will be used to calculate the regularization penalty during training.
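
For reference, here is a minimal construction sketch in Python (as opposed to building the model from a training config). The hyperparameter values are illustrative only, and the QaNetEncoder import path and constructor arguments are assumptions based on the allennlp-models layout, not something this page specifies.

from allennlp.data import Vocabulary
from allennlp.modules.matrix_attention import LinearMatrixAttention
from allennlp.modules.text_field_embedders import BasicTextFieldEmbedder
from allennlp.modules.token_embedders import Embedding
from allennlp_models.rc.models.qanet import QaNet
# Assumed import path for the CNN + self-attention encoder described above.
from allennlp_models.rc.modules.seq2seq_encoders.qanet_encoder import QaNetEncoder

vocab = Vocabulary()  # in practice, built from your dataset

embedder = BasicTextFieldEmbedder(
    {"tokens": Embedding(embedding_dim=128, num_embeddings=vocab.get_vocab_size("tokens"))}
)

def qanet_encoder(num_blocks: int, num_convs_per_block: int) -> QaNetEncoder:
    # Stacked convolution + self-attention blocks standing in for BiDAF's RNNs;
    # the argument names follow the standard qanet.jsonnet config (an assumption).
    return QaNetEncoder(
        input_dim=128,
        hidden_dim=128,
        attention_projection_dim=128,
        feedforward_hidden_dim=128,
        num_blocks=num_blocks,
        num_convs_per_block=num_convs_per_block,
        conv_kernel_size=7,
        num_attention_heads=8,
    )

model = QaNet(
    vocab=vocab,
    text_field_embedder=embedder,
    num_highway_layers=2,
    phrase_layer=qanet_encoder(num_blocks=1, num_convs_per_block=4),
    matrix_attention_layer=LinearMatrixAttention(128, 128, combination="x,y,x*y"),
    modeling_layer=qanet_encoder(num_blocks=7, num_convs_per_block=2),
    dropout_prob=0.1,
)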

forward#

class QaNet(Model):
 | ...
 | def forward(
 |     self,
 |     question: Dict[str, torch.LongTensor],
 |     passage: Dict[str, torch.LongTensor],
 |     span_start: torch.IntTensor = None,
 |     span_end: torch.IntTensor = None,
 |     metadata: List[Dict[str, Any]] = None
 | ) -> Dict[str, torch.Tensor]

Parameters

question : Dict[str, torch.LongTensor]
    From a TextField.

passage : Dict[str, torch.LongTensor]
    From a TextField. The model assumes that this passage contains the answer to the question, and predicts the beginning and ending positions of the answer within the passage.

span_start : torch.IntTensor, optional
    From an IndexField. This is one of the things we are trying to predict - the beginning position of the answer within the passage. This is an inclusive token index. If this is given, we will compute a loss that gets included in the output dictionary.

span_end : torch.IntTensor, optional
    From an IndexField. This is one of the things we are trying to predict - the ending position of the answer within the passage. This is an inclusive token index. If this is given, we will compute a loss that gets included in the output dictionary.

metadata : List[Dict[str, Any]], optional
    If present, this should contain the question tokens, passage tokens, original passage text, and token offsets into the passage for each instance in the batch. The length of this list should be the batch size, and each dictionary should have the keys question_tokens, passage_tokens, original_passage, and token_offsets.

Returns

An output dictionary consisting of:

span_start_logits : torch.FloatTensor
    A tensor of shape (batch_size, passage_length) representing unnormalized log probabilities of the span start position.

span_start_probs : torch.FloatTensor
    The result of softmax(span_start_logits).

span_end_logits : torch.FloatTensor
    A tensor of shape (batch_size, passage_length) representing unnormalized log probabilities of the span end position (inclusive).

span_end_probs : torch.FloatTensor
    The result of softmax(span_end_logits).

best_span : torch.IntTensor
    The result of a constrained inference over span_start_logits and span_end_logits to find the most probable span. Shape is (batch_size, 2), and each offset is a token index.

loss : torch.FloatTensor, optional
    A scalar loss to be optimised.

best_span_str : List[str]
    If sufficient metadata was provided for the instances in the batch, we also return the string from the original passage that the model thinks is the best answer to the question.
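
The constrained inference behind best_span maximises span_start_logits[i] + span_end_logits[j] subject to i <= j. A minimal sketch of that idea in plain PyTorch (mirroring, not reproducing, AllenNLP's span utility):

import torch

def decode_best_span(span_start_logits: torch.Tensor, span_end_logits: torch.Tensor) -> torch.Tensor:
    batch_size, passage_length = span_start_logits.shape
    # Pairwise scores: scores[b, i, j] = span_start_logits[b, i] + span_end_logits[b, j].
    scores = span_start_logits.unsqueeze(2) + span_end_logits.unsqueeze(1)
    # Keep only spans with start <= end (the upper triangle of each score matrix).
    valid = torch.ones(passage_length, passage_length, device=scores.device).triu().bool()
    scores = scores.masked_fill(~valid, float("-inf"))
    flat = scores.view(batch_size, -1).argmax(dim=-1)
    # Recover (start, end) token indices; shape (batch_size, 2).
    return torch.stack([flat // passage_length, flat % passage_length], dim=-1)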

get_metrics#

class QaNet(Model):
 | ...
 | def get_metrics(self, reset: bool = False) -> Dict[str, float]
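
A hedged usage sketch: AllenNLP's training loop calls this method at logging and epoch boundaries, with reset=True clearing the accumulators. The key names below (span accuracies plus SQuAD-style EM and F1) are an assumption about this model's implementation, not something stated on this page.

metrics = model.get_metrics(reset=True)
print(metrics.get("em"), metrics.get("f1"))  # assumed keys; inspect metrics.keys() to confirm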

default_predictor#

class QaNet(Model):
 | ...
 | default_predictor = "reading_comprehension"
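
A hedged inference sketch using this default predictor. The archive path is a placeholder for a trained model you supply, and predict(question=..., passage=...) assumes the standard reading-comprehension predictor interface.

from allennlp.predictors import Predictor
import allennlp_models.rc  # registers the "qanet" model and its predictor

predictor = Predictor.from_path(
    "/path/to/model.tar.gz",  # placeholder archive path
    predictor_name="reading_comprehension",
)
output = predictor.predict(
    question="What replaces the RNN encoder in QANet?",
    passage="QANet replaces the RNN encoders of BiDAF with convolution and self-attention.",
)
print(output["best_span_str"])  # the answer string recovered from the passage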