allennlp.models.semantic_parsing.wikitables

class allennlp.models.semantic_parsing.wikitables.wikitables_semantic_parser.WikiTablesSemanticParser(vocab: allennlp.data.vocabulary.Vocabulary, question_embedder: allennlp.modules.text_field_embedders.text_field_embedder.TextFieldEmbedder, action_embedding_dim: int, encoder: allennlp.modules.seq2seq_encoders.seq2seq_encoder.Seq2SeqEncoder, entity_encoder: allennlp.modules.seq2vec_encoders.seq2vec_encoder.Seq2VecEncoder, max_decoding_steps: int, add_action_bias: bool = True, use_neighbor_similarity_for_linking: bool = False, dropout: float = 0.0, num_linking_features: int = 10, rule_namespace: str = 'rule_labels')

Bases: allennlp.models.model.Model

A WikiTablesSemanticParser is a Model which takes as input a table and a question, and produces a logical form that answers the question when executed over the table. The logical form is generated by a type-constrained, transition-based parser. This is an abstract class that defines most of the functionality related to the transition-based parser, but it does not contain the implementation for actually training the parser. To train with a learning-to-search algorithm, use WikiTablesErmSemanticParser; if you have a set of approximate logical forms that give the correct denotation, use WikiTablesMmlSemanticParser.

Parameters

vocab : Vocabulary
question_embedder : TextFieldEmbedder: Embedder for questions.
action_embedding_dim : int: Dimension to use for action embeddings.
encoder : Seq2SeqEncoder: The encoder to use for the input question.
entity_encoder : Seq2VecEncoder: The encoder to use for averaging the words of an entity.
max_decoding_steps : int: When we’re decoding with a beam search, what’s the maximum number of steps we should take? This only applies at evaluation time, not during training.
add_action_bias : bool, optional (default=True): If True, we will learn a bias weight for each action that gets used when predicting that action, in addition to its embedding.
use_neighbor_similarity_for_linking : bool, optional (default=False): If True, we will compute a max similarity between a question token and the neighbors of an entity as a component of the linking scores. This is meant to capture the same kind of information as the related_column feature.
dropout : float, optional (default=0): If greater than 0, we will apply dropout with this probability after all encoders (PyTorch LSTMs do not apply dropout to their last layer).
num_linking_features : int, optional (default=10): We need to construct a parameter vector for the linking features, so we need to know how many there are. The default of 10 here matches the default in the KnowledgeGraphField, which is to use all ten defined features. If this is 0, another term will be added to the linking score. This term contains the maximum similarity value from the entity’s neighbors and the question.
rule_namespace : str, optional (default=rule_labels): The vocabulary namespace to use for production rules. The default corresponds to the default used in the dataset reader, so you likely don’t need to modify this.
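
As a rough illustration of how num_linking_features and use_neighbor_similarity_for_linking relate, the sketch below treats a linking score as a question/entity similarity plus a learned weighting of the linking features, with an optional neighbor-similarity term. The function name, the toy weights, and the way the terms are summed are assumptions made for illustration; this is not the library's implementation.

```python
from typing import List, Optional


def linking_score(similarity: float,
                  features: List[float],
                  feature_weights: List[float],
                  neighbor_similarity: Optional[float] = None) -> float:
    """Hypothetical linking score for one (entity, question token) pair."""
    if len(features) != len(feature_weights):
        raise ValueError("one weight is needed per linking feature")
    score = similarity
    # Learned parameter vector applied to the linking features.
    score += sum(w * f for w, f in zip(feature_weights, features))
    # Optional extra term: max similarity between the token and the entity's neighbors.
    if neighbor_similarity is not None:
        score += neighbor_similarity
    return score


num_linking_features = 10
weights = [0.1] * num_linking_features   # toy parameter vector
features = [1.0, 0.0] * 5                # toy feature vector with ten entries

plain = linking_score(0.5, features, weights)
with_neighbors = linking_score(0.5, features, weights, neighbor_similarity=0.25)
print(plain, with_neighbors)
```

The length check mirrors why the model needs num_linking_features up front: the parameter vector must have one entry per feature.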

decode(self, output_dict: Dict[str, torch.Tensor]) → Dict[str, torch.Tensor]

This method overrides Model.decode, which gets called after Model.forward, at test time, to finalize predictions. This is (confusingly) a separate notion from the “decoder” in “encoder/decoder”; that decoder logic lives in the TransitionFunction.

This method trims the output predictions to the first end symbol, replaces indices with corresponding tokens, and adds a field called predicted_tokens to the output_dict.
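
The trimming step described above can be sketched in plain Python. The end symbol "@end@", the "predictions" key, and the helper name are assumptions for this example; only the predicted_tokens key comes from the description.

```python
from typing import Dict, List

END_SYMBOL = "@end@"  # assumed end-of-sequence token for this sketch


def trim_and_detokenize(predicted_indices: List[int],
                        index_to_token: Dict[int, str],
                        end_symbol: str = END_SYMBOL) -> List[str]:
    """Map indices to tokens and cut the sequence at the first end symbol."""
    tokens = [index_to_token[i] for i in predicted_indices]
    if end_symbol in tokens:
        tokens = tokens[:tokens.index(end_symbol)]
    return tokens


# Toy vocabulary and a toy batch of one predicted index sequence.
index_to_token = {0: "@start@", 1: "@end@", 2: "count", 3: "all_rows"}
output_dict = {"predictions": [[2, 3, 1, 1, 1]]}
output_dict["predicted_tokens"] = [
    trim_and_detokenize(sequence, index_to_token)
    for sequence in output_dict["predictions"]
]
print(output_dict["predicted_tokens"])
```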

get_metrics(self, reset: bool = False) → Dict[str, float]

We track three metrics here:

1. lf_retrieval_acc, which is the percentage of the time that our best output action sequence is in the set of action sequences provided by offline search. This is an easy-to-compute lower bound on denotation accuracy for the set of examples where we actually have offline output. We only score lf_retrieval_acc on that subset.

2. denotation_acc, which is the percentage of examples where we get the correct denotation. This is the typical “accuracy” metric, and it is what you should usually report in an experimental result. You need to be careful, though, that you’re computing this on the full data, and not just the subset that has DPD output (make sure you pass “keep_if_no_dpd=True” to the dataset reader, which we do for validation data, but not training data).

3. lf_percent, which is the percentage of time that decoding actually produces a finished logical form. We might not produce a valid logical form if, for example, the decoder gets into a repetitive loop, or we’re trying to produce a very long logical form and run out of time steps.
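
To make the three definitions concrete, here is a small sketch that computes them from per-example records. The record fields and the summarize helper are hypothetical, and the values are returned as fractions between 0 and 1 rather than percentages.

```python
from typing import Dict, List, Optional

Record = Dict[str, Optional[bool]]


def rate(hits: int, total: int) -> float:
    """Fraction of hits, defined as 0.0 when there is nothing to score."""
    return hits / total if total else 0.0


def summarize(records: List[Record]) -> Dict[str, float]:
    """Each record has 'finished', 'correct_denotation', and 'in_offline_set'.

    'in_offline_set' is None when the example has no offline search output,
    so that example is excluded from lf_retrieval_acc.
    """
    with_offline = [r for r in records if r["in_offline_set"] is not None]
    return {
        "lf_retrieval_acc": rate(sum(1 for r in with_offline if r["in_offline_set"]),
                                 len(with_offline)),
        "denotation_acc": rate(sum(1 for r in records if r["correct_denotation"]),
                               len(records)),
        "lf_percent": rate(sum(1 for r in records if r["finished"]), len(records)),
    }


records = [
    {"finished": True, "correct_denotation": True, "in_offline_set": True},
    {"finished": True, "correct_denotation": False, "in_offline_set": False},
    {"finished": False, "correct_denotation": False, "in_offline_set": None},
    {"finished": True, "correct_denotation": True, "in_offline_set": None},
]
metrics = summarize(records)
print(metrics)
```

Note that denotation_acc is computed over all four records, while lf_retrieval_acc only sees the two that have offline output.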

class allennlp.models.semantic_parsing.wikitables.wikitables_mml_semantic_parser.WikiTablesMmlSemanticParser(vocab: allennlp.data.vocabulary.Vocabulary, question_embedder: allennlp.modules.text_field_embedders.text_field_embedder.TextFieldEmbedder, action_embedding_dim: int, encoder: allennlp.modules.seq2seq_encoders.seq2seq_encoder.Seq2SeqEncoder, entity_encoder: allennlp.modules.seq2vec_encoders.seq2vec_encoder.Seq2VecEncoder, decoder_beam_search: allennlp.state_machines.beam_search.BeamSearch, max_decoding_steps: int, attention: allennlp.modules.attention.attention.Attention, mixture_feedforward: allennlp.modules.feedforward.FeedForward = None, add_action_bias: bool = True, training_beam_size: int = None, use_neighbor_similarity_for_linking: bool = False, dropout: float = 0.0, num_linking_features: int = 10, rule_namespace: str = 'rule_labels')

Bases: allennlp.models.semantic_parsing.wikitables.wikitables_semantic_parser.WikiTablesSemanticParser

A WikiTablesMmlSemanticParser is a WikiTablesSemanticParser which is trained to maximize the marginal likelihood of an approximate set of logical forms which give the correct denotation. This is a re-implementation of the model used for the paper Neural Semantic Parsing with Type Constraints for Semi-Structured Tables, by Jayant Krishnamurthy, Pradeep Dasigi, and Matt Gardner (EMNLP 2017). However, the language used by this model is different from LambdaDCS, the one used in that paper: this model uses the variable-free language from allennlp.semparse.domain_languages.wikitables_language.
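
A minimal sketch of the maximum marginal likelihood objective, assuming we already have one log-probability per candidate action sequence: the loss is the negative log of the summed sequence probabilities. The function names are hypothetical, and restricting to the top training_beam_size scores is only a simplified stand-in for a constrained beam search.

```python
import math
from typing import List, Optional


def log_sum_exp(values: List[float]) -> float:
    """Numerically stable log(sum(exp(v) for v in values))."""
    largest = max(values)
    return largest + math.log(sum(math.exp(v - largest) for v in values))


def mml_loss(sequence_log_probs: List[float],
             training_beam_size: Optional[int] = None) -> float:
    """Negative log marginal likelihood over the candidate action sequences."""
    if not sequence_log_probs:
        raise ValueError("at least one candidate action sequence is required")
    scores = sorted(sequence_log_probs, reverse=True)
    if training_beam_size is not None:
        # Simplification: keep only the highest-scoring sequences.
        scores = scores[:training_beam_size]
    return -log_sum_exp(scores)


# Toy model probabilities for three logical forms with the correct denotation.
log_probs = [math.log(0.2), math.log(0.1), math.log(0.05)]
full_loss = mml_loss(log_probs)
top1_loss = mml_loss(log_probs, training_beam_size=1)
print(full_loss, top1_loss)
```

Because the probabilities are summed before taking the log, the model is free to put its mass on any of the candidates rather than being pushed toward all of them equally.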

Parameters

vocab : Vocabulary
question_embedder : TextFieldEmbedder: Embedder for questions. Passed to super class.
action_embedding_dim : int: Dimension to use for action embeddings. Passed to super class.
encoder : Seq2SeqEncoder: The encoder to use for the input question. Passed to super class.
entity_encoder : Seq2VecEncoder: The encoder to use for averaging the words of an entity. Passed to super class.
decoder_beam_search : BeamSearch: When we’re not training, this is how we will do decoding.
max_decoding_steps : int: When we’re decoding with a beam search, what’s the maximum number of steps we should take? This only applies at evaluation time, not during training. Passed to super class.
attention : Attention: We compute an attention over the input question at each step of the decoder, using the decoder hidden state as the query. Passed to the transition function.
mixture_feedforward : FeedForward, optional (default=None): If given, we’ll use this to compute a mixture probability between global actions and linked actions given the hidden state at every timestep of decoding, instead of concatenating the logits for both (where the logits may not be compatible with each other). Passed to the transition function.
add_action_bias : bool, optional (default=True): If True, we will learn a bias weight for each action that gets used when predicting that action, in addition to its embedding. Passed to super class.
training_beam_size : int, optional (default=None): If given, we will use a constrained beam search of this size during training, so that we use only the top training_beam_size action sequences according to the model in the MML computation. If this is None, we will use all of the provided action sequences in the MML computation.
use_neighbor_similarity_for_linking : bool, optional (default=False): If True, we will compute a max similarity between a question token and the neighbors of an entity as a component of the linking scores. This is meant to capture the same kind of information as the related_column feature. Passed to super class.
dropout : float, optional (default=0): If greater than 0, we will apply dropout with this probability after all encoders (PyTorch LSTMs do not apply dropout to their last layer). Passed to super class.
num_linking_features : int, optional (default=10): We need to construct a parameter vector for the linking features, so we need to know how many there are. The default of 10 here matches the default in the KnowledgeGraphField, which is to use all ten defined features. If this is 0, another term will be added to the linking score. This term contains the maximum similarity value from the entity’s neighbors and the question. Passed to super class.
rule_namespace : str, optional (default=rule_labels): The vocabulary namespace to use for production rules. The default corresponds to the default used in the dataset reader, so you likely don’t need to modify this. Passed to super class.
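
The difference between the two options described for mixture_feedforward can be sketched as follows. This assumes the mixture weight is a single probability of choosing a linked action (for example, the output of a sigmoid); the function names and toy logits are hypothetical.

```python
import math
from typing import List


def softmax(logits: List[float]) -> List[float]:
    """Numerically stable softmax over a list of logits."""
    largest = max(logits)
    exps = [math.exp(x - largest) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def mixed_action_probs(global_logits: List[float],
                       linked_logits: List[float],
                       mixture_weight: float) -> List[float]:
    """Normalize each action group separately, then mix the two distributions."""
    if not 0.0 <= mixture_weight <= 1.0:
        raise ValueError("mixture_weight must be a probability")
    global_probs = [(1.0 - mixture_weight) * p for p in softmax(global_logits)]
    linked_probs = [mixture_weight * p for p in softmax(linked_logits)]
    return global_probs + linked_probs


def concatenated_action_probs(global_logits: List[float],
                              linked_logits: List[float]) -> List[float]:
    """Alternative: one softmax over the concatenated logits."""
    return softmax(global_logits + linked_logits)


global_logits = [1.0, 2.0]        # toy scores for grammar (global) actions
linked_logits = [0.5, 0.5, 0.5]   # toy scores for entity (linked) actions
mixed = mixed_action_probs(global_logits, linked_logits, mixture_weight=0.3)
concatenated = concatenated_action_probs(global_logits, linked_logits)
print(mixed)
print(concatenated)
```

With the mixture, the total mass on linked actions equals the mixture weight regardless of the scale of either set of logits, which is the point of avoiding a shared softmax over possibly incompatible scores.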

forward(self, question: Dict[str, torch.LongTensor], table: Dict[str, torch.LongTensor], world: List[allennlp.semparse.domain_languages.wikitables_language.WikiTablesLanguage], actions: List[List[allennlp.data.fields.production_rule_field.ProductionRule]], target_values: List[List[str]] = None, target_action_sequences: torch.LongTensor = None, metadata: List[Dict[str, Any]] = None) → Dict[str, torch.Tensor]

In this method we encode the table entities, link them to words in the question, then encode the question. Then we set up the initial state for the decoder, and pass that state off to either a DecoderTrainer, if we’re training, or a BeamSearch for inference, if we’re not.

Parameters

question : Dict[str, torch.LongTensor]: The output of TextField.as_array() applied on the question TextField. This will be passed through a TextFieldEmbedder and then through an encoder.
table : Dict[str, torch.LongTensor]: The output of KnowledgeGraphField.as_array() applied on the table KnowledgeGraphField. This output is similar to a TextField output, where each entity in the table is treated as a “token”, and we will use a TextFieldEmbedder to get embeddings for each entity.
world : List[WikiTablesLanguage]: We use a MetadataField to get the WikiTablesLanguage object for each input instance. Because of how MetadataField works, this gets passed to us as a List[WikiTablesLanguage].
actions : List[List[ProductionRule]]: A list of all possible actions for each world in the batch, indexed into a ProductionRule using a ProductionRuleField. We will embed all of these and use the embeddings to determine which action to take at each timestep in the decoder.
target_values : List[List[str]], optional (default = None): For each instance, a list of target values taken from the example lisp string. We pass this list to the evaluator along with logical forms to compute denotation accuracy.
target_action_sequences : torch.LongTensor, optional (default = None): A list of possibly valid action sequences, where each action is an index into the list of possible actions. This tensor has shape (batch_size, num_action_sequences, sequence_length).
metadata : List[Dict[str, Any]], optional (default = None): Metadata containing the original tokenized question within a ‘question_tokens’ field.
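
To illustrate the (batch_size, num_action_sequences, sequence_length) shape of target_action_sequences, the sketch below pads ragged lists of action indices into a rectangular nested list. The padding value of -1 and the helper name are assumptions for this example, and plain lists stand in for a tensor.

```python
from typing import List

PADDING_INDEX = -1  # assumed padding value for this sketch


def pad_action_sequences(batch: List[List[List[int]]],
                         padding_index: int = PADDING_INDEX) -> List[List[List[int]]]:
    """Pad to shape (batch_size, num_action_sequences, sequence_length)."""
    num_sequences = max((len(instance) for instance in batch), default=0)
    sequence_length = max((len(sequence) for instance in batch for sequence in instance),
                          default=0)
    padded = []
    for instance in batch:
        # Pad each action sequence to the longest sequence in the batch.
        rows = [sequence + [padding_index] * (sequence_length - len(sequence))
                for sequence in instance]
        # Add fully padded rows for instances with fewer candidate sequences.
        rows += [[padding_index] * sequence_length
                 for _ in range(num_sequences - len(instance))]
        padded.append(rows)
    return padded


# Two instances: the first has two candidate sequences, the second has one.
batch = [[[3, 7, 2], [3, 5]], [[4, 1, 6, 2]]]
padded = pad_action_sequences(batch)
shape = (len(padded), len(padded[0]), len(padded[0][0]))
print(shape)
print(padded)
```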

class allennlp.models.semantic_parsing.wikitables.wikitables_erm_semantic_parser.WikiTablesErmSemanticParser(vocab: allennlp.data.vocabulary.Vocabulary, question_embedder: allennlp.modules.text_field_embedders.text_field_embedder.TextFieldEmbedder, action_embedding_dim: int, encoder: allennlp.modules.seq2seq_encoders.seq2seq_encoder.Seq2SeqEncoder, entity_encoder: allennlp.modules.seq2vec_encoders.seq2vec_encoder.Seq2VecEncoder, attention: allennlp.modules.attention.attention.Attention, decoder_beam_size: int, decoder_num_finished_states: int, max_decoding_steps: int, mixture_feedforward: allennlp.modules.feedforward.FeedForward = None, add_action_bias: bool = True, normalize_beam_score_by_length: bool = False, checklist_cost_weight: float = 0.6, use_neighbor_similarity_for_linking: bool = False, dropout: float = 0.0, num_linking_features: int = 10, rule_namespace: str = 'rule_labels', mml_model_file: str = None)

Bases: allennlp.models.semantic_parsing.wikitables.wikitables_semantic_parser.WikiTablesSemanticParser

A WikiTablesErmSemanticParser is a WikiTablesSemanticParser that learns to search for logical forms that yield the correct denotations.

Parameters

vocab : Vocabulary
question_embedder : TextFieldEmbedder: Embedder for questions. Passed to super class.
action_embedding_dim : int: Dimension to use for action embeddings. Passed to super class.
encoder : Seq2SeqEncoder: The encoder to use for the input question. Passed to super class.
entity_encoder : Seq2VecEncoder: The encoder to use for averaging the words of an entity. Passed to super class.
attention : Attention: We compute an attention over the input question at each step of the decoder, using the decoder hidden state as the query. Passed to the transition function.
decoder_beam_size : int: Beam size to be used by the ExpectedRiskMinimization algorithm.
decoder_num_finished_states : int: Number of finished states for which costs will be computed by the ExpectedRiskMinimization algorithm.
max_decoding_steps : int: Maximum number of steps the decoder should take before giving up. Used both during training and evaluation. Passed to super class.
add_action_bias : bool, optional (default=True): If True, we will learn a bias weight for each action that gets used when predicting that action, in addition to its embedding. Passed to super class.
normalize_beam_score_by_length : bool, optional (default=False): Should we normalize the log-probabilities by length before renormalizing the beam? This was shown to work better for NMT by Edunov et al., but that may not be the case for semantic parsing.
checklist_cost_weight : float, optional (default=0.6): Mixture weight (0-1) for combining coverage cost and denotation cost. As this increases, we weigh the coverage cost higher, with a value of 1.0 meaning that we do not care about denotation accuracy.
use_neighbor_similarity_for_linking : bool, optional (default=False): If True, we will compute a max similarity between a question token and the neighbors of an entity as a component of the linking scores. This is meant to capture the same kind of information as the related_column feature. Passed to super class.
dropout : float, optional (default=0): If greater than 0, we will apply dropout with this probability after all encoders (PyTorch LSTMs do not apply dropout to their last layer). Passed to super class.
num_linking_features : int, optional (default=10): We need to construct a parameter vector for the linking features, so we need to know how many there are. The default of 10 here matches the default in the KnowledgeGraphField, which is to use all ten defined features. If this is 0, another term will be added to the linking score. This term contains the maximum similarity value from the entity’s neighbors and the question. Passed to super class.
rule_namespace : str, optional (default=rule_labels): The vocabulary namespace to use for production rules. The default corresponds to the default used in the dataset reader, so you likely don’t need to modify this. Passed to super class.
mml_model_file : str, optional (default=None): If you want to initialize this model using weights from another model trained using MML, pass the path to the model.tar.gz file of that model here.
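
The roles of checklist_cost_weight and normalize_beam_score_by_length can be sketched with a toy expected-risk computation: the costs of the finished states are averaged under the renormalized beam probabilities. Both helper functions are hypothetical, and the convex combination used for the cost is an assumption based on the parameter description, not the library's exact formula.

```python
import math
from typing import List


def combined_cost(coverage_cost: float,
                  denotation_cost: float,
                  checklist_cost_weight: float = 0.6) -> float:
    """Assumed convex combination of coverage cost and denotation cost."""
    return (checklist_cost_weight * coverage_cost
            + (1.0 - checklist_cost_weight) * denotation_cost)


def expected_risk(log_probs: List[float],
                  lengths: List[int],
                  costs: List[float],
                  normalize_by_length: bool = False) -> float:
    """Expected cost of the finished states under renormalized beam probabilities."""
    scores = [lp / n if normalize_by_length else lp
              for lp, n in zip(log_probs, lengths)]
    largest = max(scores)
    exps = [math.exp(s - largest) for s in scores]
    total = sum(exps)
    probs = [e / total for e in exps]  # renormalize over the beam
    return sum(p * c for p, c in zip(probs, costs))


# Two finished states: a short, likely, correct one and a long, unlikely, wrong one.
log_probs = [math.log(0.6), math.log(0.2)]
lengths = [3, 5]
costs = [0.0, 1.0]
risk = expected_risk(log_probs, lengths, costs)
normalized_risk = expected_risk(log_probs, lengths, costs, normalize_by_length=True)
print(risk, normalized_risk)
print(combined_cost(2.0, 1.0), combined_cost(2.0, 1.0, checklist_cost_weight=1.0))
```

With a weight of 1.0 the denotation cost drops out entirely, matching the description of that setting.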

forward(self, question: Dict[str, torch.LongTensor], table: Dict[str, torch.LongTensor], world: List[allennlp.semparse.domain_languages.wikitables_language.WikiTablesLanguage], actions: List[List[allennlp.data.fields.production_rule_field.ProductionRule]], agenda: torch.LongTensor, target_values: List[List[str]] = None, metadata: List[Dict[str, Any]] = None) → Dict[str, torch.Tensor]

Parameters

question : Dict[str, torch.LongTensor]: The output of TextField.as_array() applied on the question TextField. This will be passed through a TextFieldEmbedder and then through an encoder.
table : Dict[str, torch.LongTensor]: The output of KnowledgeGraphField.as_array() applied on the table KnowledgeGraphField. This output is similar to a TextField output, where each entity in the table is treated as a “token”, and we will use a TextFieldEmbedder to get embeddings for each entity.
world : List[WikiTablesLanguage]: We use a MetadataField to get the WikiTablesLanguage object for each input instance. Because of how MetadataField works, this gets passed to us as a List[WikiTablesLanguage].
actions : List[List[ProductionRule]]: A list of all possible actions for each world in the batch, indexed into a ProductionRule using a ProductionRuleField. We will embed all of these and use the embeddings to determine which action to take at each timestep in the decoder.
agenda : torch.LongTensor: Agenda vectors that the checklist vectors will be compared against to compute the checklist cost.
target_values : List[List[str]], optional (default = None): For each instance, a list of target values taken from the example lisp string. We pass this list to the evaluator along with logical forms to compute denotation accuracy.
metadata : List[Dict[str, Any]], optional (default = None): Metadata containing the original tokenized question within a ‘question_tokens’ field.

get_metrics(self, reset: bool = False) → Dict[str, float]

The base class returns a dict with DPD accuracy, denotation accuracy, and logical form percentage metrics. We add the agenda coverage metric here.
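
As a rough sketch of how an agenda and a checklist might be compared, assume both have been turned into 0/1 vectors over the same set of actions: agenda[i] is 1 if action i should appear, and checklist[i] is 1 if the decoder produced it. The squared-difference cost and the coverage ratio below are illustrative assumptions, and both function names are hypothetical.

```python
from typing import List


def checklist_cost(agenda: List[int], checklist: List[int]) -> float:
    """Assumed cost: squared distance between the agenda and the checklist."""
    if len(agenda) != len(checklist):
        raise ValueError("agenda and checklist must cover the same actions")
    return float(sum((a - c) ** 2 for a, c in zip(agenda, checklist)))


def agenda_coverage(agenda: List[int], checklist: List[int]) -> float:
    """Fraction of agenda actions that also appear in the checklist."""
    wanted = sum(agenda)
    if wanted == 0:
        return 1.0  # an empty agenda is trivially covered
    covered = sum(1 for a, c in zip(agenda, checklist) if a and c)
    return covered / wanted


agenda = [1, 0, 1, 1]     # actions 0, 2, and 3 are on the agenda
checklist = [1, 0, 0, 1]  # the decoder produced actions 0 and 3
cost = checklist_cost(agenda, checklist)
coverage = agenda_coverage(agenda, checklist)
print(cost, coverage)
```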