allennlp.models.semantic_parsing.nlvr
class allennlp.models.semantic_parsing.nlvr.nlvr_semantic_parser.NlvrSemanticParser(vocab: allennlp.data.vocabulary.Vocabulary, sentence_embedder: allennlp.modules.text_field_embedders.text_field_embedder.TextFieldEmbedder, action_embedding_dim: int, encoder: allennlp.modules.seq2seq_encoders.seq2seq_encoder.Seq2SeqEncoder, dropout: float = 0.0, rule_namespace: str = 'rule_labels')

Bases: allennlp.models.model.Model

NlvrSemanticParser is a semantic parsing model built for the NLVR domain. This is an abstract class and does not have a forward method implemented. Classes that inherit from this class are expected to define their own logic depending on the kind of supervision they use. Accordingly, they should use the appropriate DecoderTrainer. This class provides some common functionality for things like defining an initial RnnStatelet, embedding actions, evaluating the denotations of completed logical forms, etc. There is a lot of overlap with WikiTablesSemanticParser here. We may want to eventually move the common functionality into a more general transition-based parsing class.

Parameters
- vocab : Vocabulary
- sentence_embedder : TextFieldEmbedder
  Embedder for sentences.
- action_embedding_dim : int
  Dimension to use for action embeddings.
- encoder : Seq2SeqEncoder
  The encoder to use for the input question.
- dropout : float, optional (default=0.0)
  Dropout on the encoder outputs.
- rule_namespace : str, optional (default=rule_labels)
  The vocabulary namespace to use for production rules. The default corresponds to the default used in the dataset reader, so you likely don't need to modify this.
decode(self, output_dict: Dict[str, torch.Tensor]) → Dict[str, torch.Tensor]

This method overrides Model.decode, which gets called after Model.forward, at test time, to finalize predictions. We only transform the action string sequences into logical forms here.
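At a high level the transformation looks roughly like the sketch below. The output key best_action_strings and the per-instance structure are assumptions made for illustration (they are not quoted from the implementation); action_sequence_to_logical_form is the domain-language helper used to recover a logical form from a sequence of production rules.

    from typing import Any, Dict, List

    def decode_logical_forms(output_dict: Dict[str, Any],
                             worlds: List[Any]) -> Dict[str, Any]:
        # Hypothetical sketch: "best_action_strings" is assumed to hold one list of
        # production-rule strings per instance, written by `forward` during beam search.
        best_action_strings = output_dict["best_action_strings"]
        logical_forms = []
        for action_strings, world in zip(best_action_strings, worlds):
            # The domain language can turn an action sequence back into a logical form;
            # if the beam produced nothing, fall back to an empty string.
            logical_forms.append(world.action_sequence_to_logical_form(action_strings)
                                 if action_strings else "")
        output_dict["logical_form"] = logical_forms
        return output_dict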
forward(self)

Defines the forward pass of the model. In addition, to facilitate easy training, this method is designed to compute a loss function defined by the user.
The input comprises everything required to perform a training update, including labels - you define the signature here! It is down to the user to ensure that inference can be performed without the presence of these labels. Hence, any inputs not available at inference time should only be used inside a conditional block.
The intended sketch of this method is as follows:
    def forward(self, input1, input2, targets=None):
        ....
        ....
        output1 = self.layer1(input1)
        output2 = self.layer2(input2)
        output_dict = {"output1": output1, "output2": output2}
        if targets is not None:
            # Function returning a scalar torch.Tensor, defined by the user.
            loss = self._compute_loss(output1, output2, targets)
            output_dict["loss"] = loss
        return output_dict
Parameters
- inputs :
  Tensors comprising everything needed to perform a training update, including labels, which should be optional (i.e. have a default value of None). At inference time, simply pass the relevant inputs, not including the labels.

Returns
- output_dict : Dict[str, torch.Tensor]
  The outputs from the model. In order to train a model using the Trainer API, you must provide a "loss" key pointing to a scalar torch.Tensor representing the loss to be optimized.
class allennlp.models.semantic_parsing.nlvr.nlvr_coverage_semantic_parser.NlvrCoverageSemanticParser(vocab: allennlp.data.vocabulary.Vocabulary, sentence_embedder: allennlp.modules.text_field_embedders.text_field_embedder.TextFieldEmbedder, action_embedding_dim: int, encoder: allennlp.modules.seq2seq_encoders.seq2seq_encoder.Seq2SeqEncoder, attention: allennlp.modules.attention.attention.Attention, beam_size: int, max_decoding_steps: int, max_num_finished_states: int = None, dropout: float = 0.0, normalize_beam_score_by_length: bool = False, checklist_cost_weight: float = 0.6, dynamic_cost_weight: Dict[str, Union[int, float]] = None, penalize_non_agenda_actions: bool = False, initial_mml_model_file: str = None)

Bases: allennlp.models.semantic_parsing.nlvr.nlvr_semantic_parser.NlvrSemanticParser

NlvrCoverageSemanticParser is an NlvrSemanticParser that gets around the problem of lack of annotated logical forms by maximizing coverage of the output sequences over a prespecified agenda. In addition to the signal from coverage, we also compute the denotations given by the logical forms and define a hybrid cost based on coverage and denotation errors. The training process then minimizes the expected value of this cost over an approximate set of logical forms produced by the parser, obtained by performing beam search.

Parameters
- vocab : Vocabulary
  Passed to super-class.
- sentence_embedder : TextFieldEmbedder
  Passed to super-class.
- action_embedding_dim : int
  Passed to super-class.
- encoder : Seq2SeqEncoder
  Passed to super-class.
- attention : Attention
  We compute an attention over the input question at each step of the decoder, using the decoder hidden state as the query. Passed to the TransitionFunction.
- beam_size : int
  Beam size for the beam search used during training.
- max_num_finished_states : int, optional (default=None)
  Maximum number of finished states the trainer should compute costs for.
- normalize_beam_score_by_length : bool, optional (default=False)
  Should the log probabilities be normalized by length before renormalizing them? Edunov et al. do this in their work, but we found that not doing it works better. It's possible they did this because their task is NMT, where longer decoded sequences are not necessarily worse and shouldn't be penalized, while we will mostly want to penalize longer logical forms.
- max_decoding_steps : int
  Maximum number of steps for the beam search during training.
- dropout : float, optional (default=0.0)
  Probability of dropout to apply on encoder outputs, decoder outputs and predicted actions.
- checklist_cost_weight : float, optional (default=0.6)
  Mixture weight (0-1) for combining coverage cost and denotation cost. As this increases, we weigh the coverage cost higher, with a value of 1.0 meaning that we do not care about denotation accuracy.
- dynamic_cost_weight : Dict[str, Union[int, float]], optional (default=None)
  A dict containing keys wait_num_epochs and rate, indicating the number of epochs after which we should start decreasing the weight on checklist cost in favor of denotation cost, and the rate at which we should do it. We will decrease the weight as checklist_cost_weight = checklist_cost_weight - rate * checklist_cost_weight, starting at the appropriate epoch. The weight will remain constant if this is not provided. See the cost sketch after this parameter list.
- penalize_non_agenda_actions : bool, optional (default=False)
  Should we penalize the model for producing terminal actions that are outside the agenda?
- initial_mml_model_file : str, optional (default=None)
  If you want to initialize this model using weights from another model trained using MML, pass the path to the model.tar.gz file of that model here.
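The following is a minimal sketch of how the checklist (coverage) cost, the denotation cost, and the dynamic weight interact. The helper names, the scalar stand-ins for the two costs, and the assumption that the decay step is applied once per epoch after wait_num_epochs are all illustrative; this is not the actual training code.

    import torch

    def hybrid_cost(checklist_cost: torch.Tensor,
                    denotation_cost: torch.Tensor,
                    checklist_cost_weight: float) -> torch.Tensor:
        # Mixture of the two costs; a weight of 1.0 ignores denotation accuracy entirely.
        return (checklist_cost_weight * checklist_cost
                + (1 - checklist_cost_weight) * denotation_cost)

    def decayed_weight(initial_weight: float, rate: float,
                       epoch: int, wait_num_epochs: int) -> float:
        # After `wait_num_epochs`, shrink the weight once per epoch, mirroring
        # checklist_cost_weight = checklist_cost_weight - rate * checklist_cost_weight.
        weight = initial_weight
        for _ in range(max(0, epoch - wait_num_epochs)):
            weight -= rate * weight
        return weight

    # Example: the weight starts at 0.6 and decays by 10% per epoch after epoch 5.
    print(decayed_weight(0.6, 0.1, epoch=8, wait_num_epochs=5))  # ~0.437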
forward(self, sentence: Dict[str, torch.LongTensor], worlds: List[List[allennlp.semparse.domain_languages.nlvr_language.NlvrLanguage]], actions: List[List[allennlp.data.fields.production_rule_field.ProductionRule]], agenda: torch.LongTensor, identifier: List[str] = None, labels: torch.LongTensor = None, epoch_num: List[int] = None, metadata: List[Dict[str, Any]] = None) → Dict[str, torch.Tensor]

Decoder logic for producing type-constrained target sequences that maximize coverage of their respective agendas and minimize a denotation-based loss.
get_metrics(self, reset: bool = False) → Dict[str, float]

Returns a dictionary of metrics. This method will be called by allennlp.training.Trainer in order to compute and use model metrics for early stopping and model serialization. We return an empty dictionary here rather than raising as it is not required to implement metrics for a new model. A boolean reset parameter is passed, as frequently a metric accumulator will have some state which should be reset between epochs. Metrics should be populated during the call to forward, with the Metric handling the accumulation of the metric until this method is called.
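A minimal sketch of the pattern follows. The metric names (denotation_accuracy, consistency) and the use of Average are assumptions chosen for illustration; the actual parsers may track different metrics.

    from typing import Dict

    from allennlp.training.metrics import Average


    class MetricsExample:
        """Toy stand-in showing the get_metrics pattern; metric names are illustrative."""

        def __init__(self) -> None:
            # Accumulators that would normally be updated inside `forward`;
            # here we just seed them with example values.
            self._denotation_accuracy = Average()
            self._consistency = Average()
            self._denotation_accuracy(1.0)
            self._consistency(0.0)

        def get_metrics(self, reset: bool = False) -> Dict[str, float]:
            # Each Metric accumulates across batches; `reset=True` clears its state,
            # which the Trainer typically requests at the end of an epoch.
            return {
                "denotation_accuracy": self._denotation_accuracy.get_metric(reset),
                "consistency": self._consistency.get_metric(reset),
            }


    print(MetricsExample().get_metrics(reset=True))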
class allennlp.models.semantic_parsing.nlvr.nlvr_direct_semantic_parser.NlvrDirectSemanticParser(vocab: allennlp.data.vocabulary.Vocabulary, sentence_embedder: allennlp.modules.text_field_embedders.text_field_embedder.TextFieldEmbedder, action_embedding_dim: int, encoder: allennlp.modules.seq2seq_encoders.seq2seq_encoder.Seq2SeqEncoder, attention: allennlp.modules.attention.attention.Attention, decoder_beam_search: allennlp.state_machines.beam_search.BeamSearch, max_decoding_steps: int, dropout: float = 0.0)

Bases: allennlp.models.semantic_parsing.nlvr.nlvr_semantic_parser.NlvrSemanticParser

NlvrDirectSemanticParser is an NlvrSemanticParser that gets around the problem of lack of logical form annotations by maximizing the marginal likelihood of an approximate set of target sequences that yield the correct denotation. The main difference between this parser and NlvrCoverageSemanticParser is that while this parser takes the output of an offline search process as the set of target sequences for training, the latter performs search during training.

Parameters
- vocab : Vocabulary
  Passed to super-class.
- sentence_embedder : TextFieldEmbedder
  Passed to super-class.
- action_embedding_dim : int
  Passed to super-class.
- encoder : Seq2SeqEncoder
  Passed to super-class.
- attention : Attention
  We compute an attention over the input question at each step of the decoder, using the decoder hidden state as the query. Passed to the TransitionFunction.
- decoder_beam_search : BeamSearch
  Beam search used to retrieve best sequences after training.
- max_decoding_steps : int
  Maximum number of steps for beam search after training.
- dropout : float, optional (default=0.0)
  Probability of dropout to apply on encoder outputs, decoder outputs and predicted actions.
forward(self, sentence: Dict[str, torch.LongTensor], worlds: List[List[allennlp.semparse.domain_languages.nlvr_language.NlvrLanguage]], actions: List[List[allennlp.data.fields.production_rule_field.ProductionRule]], identifier: List[str] = None, target_action_sequences: torch.LongTensor = None, labels: torch.LongTensor = None, metadata: List[Dict[str, Any]] = None) → Dict[str, torch.Tensor]

Decoder logic for producing type-constrained target sequences, trained to maximize marginal likelihood over a set of approximate logical forms.
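At a high level, the training objective is the negative marginal log-likelihood of the approximate target sequences. Below is a minimal sketch of that computation, assuming per-sequence log probabilities have already been produced by the decoder; the tensor names and masking convention are illustrative, not the library's own API.

    import torch

    def marginal_log_likelihood_loss(target_sequence_log_probs: torch.Tensor,
                                     target_mask: torch.Tensor) -> torch.Tensor:
        """Negative log of the summed probability of all approximate target sequences.

        target_sequence_log_probs : (batch_size, num_targets) log probability of each
            candidate action sequence under the model.
        target_mask : (batch_size, num_targets) 1 for real candidates, 0 for padding.
        """
        # Send padded candidates to -inf so they contribute nothing to the logsumexp.
        masked_log_probs = target_sequence_log_probs.masked_fill(target_mask == 0, float("-inf"))
        # log sum_i p(sequence_i) per instance, then average the negative over the batch.
        marginal = torch.logsumexp(masked_log_probs, dim=-1)
        return -marginal.mean()

    # Toy example: two instances, each with up to two candidate sequences.
    log_probs = torch.log(torch.tensor([[0.2, 0.1], [0.05, 0.0001]]))
    mask = torch.tensor([[1, 1], [1, 0]])
    print(marginal_log_likelihood_loss(log_probs, mask))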
get_metrics(self, reset: bool = False) → Dict[str, float]

Returns a dictionary of metrics. This method will be called by allennlp.training.Trainer in order to compute and use model metrics for early stopping and model serialization. We return an empty dictionary here rather than raising as it is not required to implement metrics for a new model. A boolean reset parameter is passed, as frequently a metric accumulator will have some state which should be reset between epochs. Metrics should be populated during the call to forward, with the Metric handling the accumulation of the metric until this method is called.