allennlp.models.semantic_parsing.nlvr
class allennlp.models.semantic_parsing.nlvr.nlvr_semantic_parser.NlvrSemanticParser(vocab: allennlp.data.vocabulary.Vocabulary, sentence_embedder: allennlp.modules.text_field_embedders.text_field_embedder.TextFieldEmbedder, action_embedding_dim: int, encoder: allennlp.modules.seq2seq_encoders.seq2seq_encoder.Seq2SeqEncoder, dropout: float = 0.0, rule_namespace: str = 'rule_labels')

Bases: allennlp.models.model.Model
NlvrSemanticParser is a semantic parsing model built for the NLVR domain. This is an abstract class and does not have a forward method implemented. Classes that inherit from this class are expected to define their own logic depending on the kind of supervision they use. Accordingly, they should use the appropriate DecoderTrainer. This class provides some common functionality for things like defining an initial RnnStatelet, embedding actions, evaluating the denotations of completed logical forms, etc. There is a lot of overlap with WikiTablesSemanticParser here. We may want to eventually move the common functionality into a more general transition-based parsing class.

Parameters
- vocab : Vocabulary
- sentence_embedder : TextFieldEmbedder
  Embedder for sentences.
- action_embedding_dim : int
  Dimension to use for action embeddings.
- encoder : Seq2SeqEncoder
  The encoder to use for the input question.
- dropout : float, optional (default=0.0)
  Dropout on the encoder outputs.
- rule_namespace : str, optional (default=rule_labels)
  The vocabulary namespace to use for production rules. The default corresponds to the default used in the dataset reader, so you likely don't need to modify this.
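As a rough illustration of what these constructor arguments look like in code, here is a minimal sketch that assembles them by hand. The dimensions are arbitrary illustrative values, the Vocabulary would normally be built from dataset instances, and in practice these objects are typically constructed from an experiment configuration rather than directly:

    import torch
    from allennlp.data import Vocabulary
    from allennlp.modules.text_field_embedders import BasicTextFieldEmbedder
    from allennlp.modules.token_embedders import Embedding
    from allennlp.modules.seq2seq_encoders import PytorchSeq2SeqWrapper

    # Normally built from the dataset reader's instances; empty here for brevity.
    vocab = Vocabulary()

    # Embeds the "tokens" index of the sentence TextField (dimension is illustrative).
    sentence_embedder = BasicTextFieldEmbedder({
        "tokens": Embedding(num_embeddings=vocab.get_vocab_size("tokens"),
                            embedding_dim=50)
    })

    # Any Seq2SeqEncoder works; here a single-layer LSTM wrapped for AllenNLP.
    encoder = PytorchSeq2SeqWrapper(
        torch.nn.LSTM(input_size=50, hidden_size=30, batch_first=True)
    )

    # NlvrSemanticParser itself is abstract; vocab, sentence_embedder, encoder,
    # action_embedding_dim, and dropout would be passed to a concrete subclass
    # such as NlvrDirectSemanticParser or NlvrCoverageSemanticParser.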
decode(self, output_dict: Dict[str, torch.Tensor]) → Dict[str, torch.Tensor]

This method overrides Model.decode, which gets called after Model.forward, at test time, to finalize predictions. We only transform the action string sequences into logical forms here.
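As a usage sketch (not part of the documented API itself): a trained concrete subclass is typically run through forward and then decode to obtain readable logical forms. The model and batch variables below are assumed to come from a trained archive and a data iterator, and the "logical_form" output key is an assumption for illustration:

    # `model` is assumed to be a trained subclass of NlvrSemanticParser, and
    # `batch` a tensor dictionary produced by an AllenNLP data iterator.
    output_dict = model(**batch)             # forward(): action sequences, scores, loss
    output_dict = model.decode(output_dict)  # adds human-readable logical forms
    print(output_dict["logical_form"])       # key name is an assumption for illustration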
forward(self)

Defines the forward pass of the model. In addition, to facilitate easy training, this method is designed to compute a loss function defined by a user.

The input comprises everything required to perform a training update, including labels - you define the signature here! It is down to the user to ensure that inference can be performed without the presence of these labels. Hence, any inputs not available at inference time should only be used inside a conditional block.

The intended sketch of this method is as follows:

    def forward(self, input1, input2, targets=None):
        ....
        ....
        output1 = self.layer1(input1)
        output2 = self.layer2(input2)
        output_dict = {"output1": output1, "output2": output2}
        if targets is not None:
            # Function returning a scalar torch.Tensor, defined by the user.
            loss = self._compute_loss(output1, output2, targets)
            output_dict["loss"] = loss
        return output_dict
Parameters
- inputs:
  Tensors comprising everything needed to perform a training update, including labels, which should be optional (i.e. have a default value of None). At inference time, simply pass the relevant inputs, not including the labels.

Returns
- output_dict: Dict[str, torch.Tensor]
  The outputs from the model. In order to train a model using the Trainer API, you must provide a "loss" key pointing to a scalar torch.Tensor representing the loss to be optimized.
class allennlp.models.semantic_parsing.nlvr.nlvr_coverage_semantic_parser.NlvrCoverageSemanticParser(vocab: allennlp.data.vocabulary.Vocabulary, sentence_embedder: allennlp.modules.text_field_embedders.text_field_embedder.TextFieldEmbedder, action_embedding_dim: int, encoder: allennlp.modules.seq2seq_encoders.seq2seq_encoder.Seq2SeqEncoder, attention: allennlp.modules.attention.attention.Attention, beam_size: int, max_decoding_steps: int, max_num_finished_states: int = None, dropout: float = 0.0, normalize_beam_score_by_length: bool = False, checklist_cost_weight: float = 0.6, dynamic_cost_weight: Dict[str, Union[int, float]] = None, penalize_non_agenda_actions: bool = False, initial_mml_model_file: str = None)

Bases: allennlp.models.semantic_parsing.nlvr.nlvr_semantic_parser.NlvrSemanticParser

NlvrCoverageSemanticParser is an NlvrSemanticParser that gets around the problem of lack of annotated logical forms by maximizing coverage of the output sequences over a prespecified agenda. In addition to the signal from coverage, we also compute the denotations given by the logical forms and define a hybrid cost based on coverage and denotation errors. The training process then minimizes the expected value of this cost over an approximate set of logical forms produced by the parser, obtained by performing beam search.

Parameters
- vocab : Vocabulary
  Passed to super-class.
- sentence_embedder : TextFieldEmbedder
  Passed to super-class.
- action_embedding_dim : int
  Passed to super-class.
- encoder : Seq2SeqEncoder
  Passed to super-class.
- attention : Attention
  We compute an attention over the input question at each step of the decoder, using the decoder hidden state as the query. Passed to the TransitionFunction.
- beam_size : int
  Beam size for the beam search used during training.
- max_num_finished_states : int, optional (default=None)
  Maximum number of finished states the trainer should compute costs for.
- normalize_beam_score_by_length : bool, optional (default=False)
  Should the log probabilities be normalized by length before renormalizing them? Edunov et al. do this in their work, but we found that not doing it works better. It's possible they did this because their task is NMT, where longer decoded sequences are not necessarily worse and shouldn't be penalized, while we will mostly want to penalize longer logical forms.
- max_decoding_steps : int
  Maximum number of steps for the beam search during training.
- dropout : float, optional (default=0.0)
  Probability of dropout to apply on encoder outputs, decoder outputs and predicted actions.
- checklist_cost_weight : float, optional (default=0.6)
  Mixture weight (0-1) for combining coverage cost and denotation cost. As this increases, we weigh the coverage cost higher, with a value of 1.0 meaning that we do not care about denotation accuracy.
- dynamic_cost_weight : Dict[str, Union[int, float]], optional (default=None)
  A dict containing keys wait_num_epochs and rate, indicating the number of epochs after which we should start decreasing the weight on checklist cost in favor of denotation cost, and the rate at which we should do it. We will decrease the weight as checklist_cost_weight = checklist_cost_weight - rate * checklist_cost_weight, starting at the appropriate epoch (see the sketch after this parameter list). The weight will remain constant if this is not provided.
- penalize_non_agenda_actions : bool, optional (default=False)
  Should we penalize the model for producing terminal actions that are outside the agenda?
- initial_mml_model_file : str, optional (default=None)
  If you want to initialize this model using weights from another model trained using MML, pass the path to the model.tar.gz file of that model here.
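To make the dynamic_cost_weight schedule concrete, here is a small sketch of the update rule above. The function name and values are hypothetical, and the once-per-epoch application after the waiting period is an assumption for illustration:

    def decayed_checklist_cost_weight(initial_weight: float,
                                      rate: float,
                                      wait_num_epochs: int,
                                      epoch_num: int) -> float:
        # Apply weight = weight - rate * weight once per epoch after the wait.
        weight = initial_weight
        for _ in range(max(0, epoch_num - wait_num_epochs)):
            weight -= rate * weight
        return weight

    # With checklist_cost_weight=0.6, rate=0.05, wait_num_epochs=8:
    # epoch 8 -> 0.6, epoch 9 -> 0.57, epoch 10 -> 0.5415, ...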
forward(self, sentence: Dict[str, torch.LongTensor], worlds: List[List[allennlp.semparse.domain_languages.nlvr_language.NlvrLanguage]], actions: List[List[allennlp.data.fields.production_rule_field.ProductionRule]], agenda: torch.LongTensor, identifier: List[str] = None, labels: torch.LongTensor = None, epoch_num: List[int] = None, metadata: List[Dict[str, Any]] = None) → Dict[str, torch.Tensor]

Decoder logic for producing type-constrained target sequences that maximize coverage of their respective agendas, and minimize a denotation-based loss.
get_metrics(self, reset: bool = False) → Dict[str, float]

Returns a dictionary of metrics. This method will be called by allennlp.training.Trainer in order to compute and use model metrics for early stopping and model serialization. We return an empty dictionary here rather than raising, as it is not required to implement metrics for a new model. A boolean reset parameter is passed, as frequently a metric accumulator will have some state which should be reset between epochs. This is also compatible with Metric objects: metrics should be populated during the call to forward, with the Metric handling the accumulation of the metric until this method is called.
class allennlp.models.semantic_parsing.nlvr.nlvr_direct_semantic_parser.NlvrDirectSemanticParser(vocab: allennlp.data.vocabulary.Vocabulary, sentence_embedder: allennlp.modules.text_field_embedders.text_field_embedder.TextFieldEmbedder, action_embedding_dim: int, encoder: allennlp.modules.seq2seq_encoders.seq2seq_encoder.Seq2SeqEncoder, attention: allennlp.modules.attention.attention.Attention, decoder_beam_search: allennlp.state_machines.beam_search.BeamSearch, max_decoding_steps: int, dropout: float = 0.0)

Bases: allennlp.models.semantic_parsing.nlvr.nlvr_semantic_parser.NlvrSemanticParser

NlvrDirectSemanticParser is an NlvrSemanticParser that gets around the problem of lack of logical form annotations by maximizing the marginal likelihood of an approximate set of target sequences that yield the correct denotation. The main difference between this parser and NlvrCoverageSemanticParser is that while this parser takes the output of an offline search process as the set of target sequences for training, the latter performs search during training.

Parameters
- vocab : Vocabulary
  Passed to super-class.
- sentence_embedder : TextFieldEmbedder
  Passed to super-class.
- action_embedding_dim : int
  Passed to super-class.
- encoder : Seq2SeqEncoder
  Passed to super-class.
- attention : Attention
  We compute an attention over the input question at each step of the decoder, using the decoder hidden state as the query. Passed to the TransitionFunction.
- decoder_beam_search : BeamSearch
  Beam search used to retrieve best sequences after training.
- max_decoding_steps : int
  Maximum number of steps for beam search after training.
- dropout : float, optional (default=0.0)
  Probability of dropout to apply on encoder outputs, decoder outputs and predicted actions.
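A minimal construction sketch for this class, assuming the vocab, sentence_embedder, and encoder objects from the earlier sketch. The attention module choice, dimensions, beam size, and other values are arbitrary illustrative choices, not recommended settings:

    from allennlp.modules.attention import DotProductAttention
    from allennlp.state_machines.beam_search import BeamSearch
    from allennlp.models.semantic_parsing.nlvr.nlvr_direct_semantic_parser import (
        NlvrDirectSemanticParser,
    )

    # `vocab`, `sentence_embedder`, and `encoder` are assumed to be built as in
    # the earlier sketch; dimensions must line up so that the decoder hidden
    # state and the encoder outputs are compatible with the attention module.
    model = NlvrDirectSemanticParser(
        vocab=vocab,
        sentence_embedder=sentence_embedder,
        action_embedding_dim=50,
        encoder=encoder,
        attention=DotProductAttention(),
        decoder_beam_search=BeamSearch(beam_size=10),
        max_decoding_steps=20,
        dropout=0.2,
    )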
forward(self, sentence: Dict[str, torch.LongTensor], worlds: List[List[allennlp.semparse.domain_languages.nlvr_language.NlvrLanguage]], actions: List[List[allennlp.data.fields.production_rule_field.ProductionRule]], identifier: List[str] = None, target_action_sequences: torch.LongTensor = None, labels: torch.LongTensor = None, metadata: List[Dict[str, Any]] = None) → Dict[str, torch.Tensor]

Decoder logic for producing type-constrained target sequences, trained to maximize marginal likelihood over a set of approximate logical forms.
get_metrics(self, reset: bool = False) → Dict[str, float]

Returns a dictionary of metrics. This method will be called by allennlp.training.Trainer in order to compute and use model metrics for early stopping and model serialization. We return an empty dictionary here rather than raising, as it is not required to implement metrics for a new model. A boolean reset parameter is passed, as frequently a metric accumulator will have some state which should be reset between epochs. This is also compatible with Metric objects: metrics should be populated during the call to forward, with the Metric handling the accumulation of the metric until this method is called.