allennlp.models.semantic_parsing.wikitables
-
class
allennlp.models.semantic_parsing.wikitables.wikitables_semantic_parser.
WikiTablesSemanticParser
(vocab: allennlp.data.vocabulary.Vocabulary, question_embedder: allennlp.modules.text_field_embedders.text_field_embedder.TextFieldEmbedder, action_embedding_dim: int, encoder: allennlp.modules.seq2seq_encoders.seq2seq_encoder.Seq2SeqEncoder, entity_encoder: allennlp.modules.seq2vec_encoders.seq2vec_encoder.Seq2VecEncoder, max_decoding_steps: int, add_action_bias: bool = True, use_neighbor_similarity_for_linking: bool = False, dropout: float = 0.0, num_linking_features: int = 10, rule_namespace: str = 'rule_labels') Bases:
allennlp.models.model.Model
A
WikiTablesSemanticParser
is a Model
which takes as input a table and a question, and produces a logical form that answers the question when executed over the table. The logical form is generated by a type-constrained, transition-based parser. This is an abstract class that defines most of the functionality related to the transition-based parser. It does not contain the implementation for actually training the parser. You may want to train it using a learning-to-search algorithm, in which case you will want to use WikiTablesErmSemanticParser
, or if you have a set of approximate logical forms that give the correct denotation, you will want to use WikiTablesMmlSemanticParser
.
- Parameters
- vocab
Vocabulary
- question_embedder
TextFieldEmbedder
Embedder for questions.
- action_embedding_dim
int
Dimension to use for action embeddings.
- encoder
Seq2SeqEncoder
The encoder to use for the input question.
- entity_encoder
Seq2VecEncoder
The encoder used for averaging the words of an entity.
- max_decoding_steps
int
When we’re decoding with a beam search, what’s the maximum number of steps we should take? This only applies at evaluation time, not during training.
- add_action_bias
bool
, optional (default=True) If
True
, we will learn a bias weight for each action that gets used when predicting that action, in addition to its embedding.
- use_neighbor_similarity_for_linking
bool
, optional (default=False) If
True
, we will compute a max similarity between a question token and the neighbors of an entity as a component of the linking scores. This is meant to capture the same kind of information as the related_column
feature.
- dropout
float
, optional (default=0) If greater than 0, we will apply dropout with this probability after all encoders (pytorch LSTMs do not apply dropout to their last layer).
- num_linking_features
int
, optional (default=10) We need to construct a parameter vector for the linking features, so we need to know how many there are. The default of 10 here matches the default in the
KnowledgeGraphField
, which is to use all ten defined features. If this is 0, another term will be added to the linking score. This term contains the maximum similarity value from the entity’s neighbors and the question.
- rule_namespace
str
, optional (default=rule_labels) The vocabulary namespace to use for production rules. The default corresponds to the default used in the dataset reader, so you likely don’t need to modify this.
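The role of num_linking_features can be illustrated with a minimal sketch. It assumes the linking score for one (entity, question token) pair is a similarity value plus a weighted sum of the linking features. The function name, the bias term, and the plain-float representation are hypothetical and chosen for illustration only; the real model works on batched tensors with learned parameters.

```python
from typing import Sequence


def linking_score(similarity: float,
                  features: Sequence[float],
                  feature_weights: Sequence[float],
                  bias: float = 0.0) -> float:
    """Hypothetical linking score for one (entity, question token) pair.

    `feature_weights` stands in for the learned parameter vector, so its
    length must equal the number of linking features.
    """
    if len(features) != len(feature_weights):
        raise ValueError("need exactly one weight per linking feature")
    # Weighted sum of the linking features.
    feature_term = sum(f * w for f, w in zip(features, feature_weights))
    return similarity + feature_term + bias
```

This is why the parameter count has to be known at construction time: the weight vector needs one entry per feature produced by the KnowledgeGraphField.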
-
decode
(self, output_dict: Dict[str, torch.Tensor]) → Dict[str, torch.Tensor] This method overrides
Model.decode
, which gets called after Model.forward
, at test time, to finalize predictions. This is (confusingly) a separate notion from the “decoder” in “encoder/decoder”; that decoder logic lives in the TransitionFunction
. This method trims the output predictions to the first end symbol, replaces indices with corresponding tokens, and adds a field called
predicted_tokens
to the output_dict
.
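The trimming step described above can be sketched in plain Python. This is an illustration under assumptions, not the library's code: the helper name, the index_to_token mapping, and the end_index argument are hypothetical stand-ins for whatever the vocabulary provides.

```python
from typing import Dict, List, Sequence


def trim_and_detokenize(predicted_indices: Sequence[int],
                        index_to_token: Dict[int, str],
                        end_index: int) -> List[str]:
    """Cut a prediction at the first end symbol and map indices to tokens."""
    tokens: List[str] = []
    for index in predicted_indices:
        if index == end_index:
            # Everything from the first end symbol onward is discarded.
            break
        tokens.append(index_to_token[index])
    return tokens
```

Under this sketch, the result for each instance is what would be stored under predicted_tokens in the output_dict.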
-
get_metrics
(self, reset: bool = False) → Dict[str, float] We track three metrics here:
1. lf_retrieval_acc, which is the percentage of the time that our best output action sequence is in the set of action sequences provided by offline search. This is an easy-to-compute lower bound on denotation accuracy for the set of examples where we actually have offline output. We only score lf_retrieval_acc on that subset.
2. denotation_acc, which is the percentage of examples where we get the correct denotation. This is the typical “accuracy” metric, and it is what you should usually report in an experimental result. You need to be careful, though, that you’re computing this on the full data, and not just the subset that has DPD output (make sure you pass “keep_if_no_dpd=True” to the dataset reader, which we do for validation data, but not training data).
3. lf_percent, which is the percentage of time that decoding actually produces a finished logical form. We might not produce a valid logical form if the decoder gets into a repetitive loop, or if we’re trying to produce a very long logical form and run out of time steps.
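The difference in denominators between the three metrics can be made concrete with a short sketch. The counter names and the ParserCounts class are hypothetical; only the metric names come from the description above. Note that lf_retrieval_acc is divided by the number of examples that have offline search output, while the other two are divided by the total number of examples.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class ParserCounts:
    """Hypothetical running counts accumulated over an evaluation pass."""
    num_examples: int
    num_with_offline_output: int
    num_retrieved: int            # best sequence found in the offline set
    num_correct_denotation: int
    num_finished: int             # decoding produced a finished logical form


def compute_metrics(counts: ParserCounts) -> Dict[str, float]:
    def ratio(numerator: int, denominator: int) -> float:
        # Guard against empty evaluation sets.
        return numerator / denominator if denominator else 0.0

    return {
        "lf_retrieval_acc": ratio(counts.num_retrieved,
                                  counts.num_with_offline_output),
        "denotation_acc": ratio(counts.num_correct_denotation,
                                counts.num_examples),
        "lf_percent": ratio(counts.num_finished, counts.num_examples),
    }
```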
-
class
allennlp.models.semantic_parsing.wikitables.wikitables_mml_semantic_parser.
WikiTablesMmlSemanticParser
(vocab: allennlp.data.vocabulary.Vocabulary, question_embedder: allennlp.modules.text_field_embedders.text_field_embedder.TextFieldEmbedder, action_embedding_dim: int, encoder: allennlp.modules.seq2seq_encoders.seq2seq_encoder.Seq2SeqEncoder, entity_encoder: allennlp.modules.seq2vec_encoders.seq2vec_encoder.Seq2VecEncoder, decoder_beam_search: allennlp.state_machines.beam_search.BeamSearch, max_decoding_steps: int, attention: allennlp.modules.attention.attention.Attention, mixture_feedforward: allennlp.modules.feedforward.FeedForward = None, add_action_bias: bool = True, training_beam_size: int = None, use_neighbor_similarity_for_linking: bool = False, dropout: float = 0.0, num_linking_features: int = 10, rule_namespace: str = 'rule_labels') Bases:
allennlp.models.semantic_parsing.wikitables.wikitables_semantic_parser.WikiTablesSemanticParser
A
WikiTablesMmlSemanticParser
is a WikiTablesSemanticParser
which is trained to maximize the marginal likelihood of an approximate set of logical forms which give the correct denotation. This is a re-implementation of the model used for the paper Neural Semantic Parsing with Type Constraints for Semi-Structured Tables, by Jayant Krishnamurthy, Pradeep Dasigi, and Matt Gardner (EMNLP 2017). Unlike the paper, which uses LambdaDCS, this model uses the variable-free language from allennlp.semparse.domain_languages.wikitables_language
.
- Parameters
- vocab
Vocabulary
- question_embedder
TextFieldEmbedder
Embedder for questions. Passed to super class.
- action_embedding_dim
int
Dimension to use for action embeddings. Passed to super class.
- encoder
Seq2SeqEncoder
The encoder to use for the input question. Passed to super class.
- entity_encoder
Seq2VecEncoder
The encoder used for averaging the words of an entity. Passed to super class.
- decoder_beam_search
BeamSearch
When we’re not training, this is how we will do decoding.
- max_decoding_steps
int
When we’re decoding with a beam search, what’s the maximum number of steps we should take? This only applies at evaluation time, not during training. Passed to super class.
- attention
Attention
We compute an attention over the input question at each step of the decoder, using the decoder hidden state as the query. Passed to the transition function.
- mixture_feedforward
FeedForward
, optional (default=None) If given, we’ll use this to compute a mixture probability between global actions and linked actions given the hidden state at every timestep of decoding, instead of concatenating the logits for both (where the logits may not be compatible with each other). Passed to the transition function.
- add_action_bias
bool
, optional (default=True) If
True
, we will learn a bias weight for each action that gets used when predicting that action, in addition to its embedding. Passed to super class.
- training_beam_size
int
, optional (default=None) If given, we will use a constrained beam search of this size during training, so that we use only the top
training_beam_size
action sequences according to the model in the MML computation. If this is None
, we will use all of the provided action sequences in the MML computation.
- use_neighbor_similarity_for_linking
bool
, optional (default=False) If
True
, we will compute a max similarity between a question token and the neighbors of an entity as a component of the linking scores. This is meant to capture the same kind of information as the related_column
feature. Passed to super class.
- dropout
float
, optional (default=0) If greater than 0, we will apply dropout with this probability after all encoders (pytorch LSTMs do not apply dropout to their last layer). Passed to super class.
- num_linking_features
int
, optional (default=10) We need to construct a parameter vector for the linking features, so we need to know how many there are. The default of 10 here matches the default in the
KnowledgeGraphField
, which is to use all ten defined features. If this is 0, another term will be added to the linking score. This term contains the maximum similarity value from the entity’s neighbors and the question. Passed to super class.
- rule_namespace
str
, optional (default=rule_labels) The vocabulary namespace to use for production rules. The default corresponds to the default used in the dataset reader, so you likely don’t need to modify this. Passed to super class.
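The maximum marginal likelihood (MML) objective and the effect of training_beam_size can be sketched as follows. This is a simplified illustration under assumptions: it takes the model's log-probability for each provided action sequence as a plain list, and it mimics the beam restriction by keeping the highest-scoring sequences. The function names are hypothetical, and the real model performs this over batched tensors with a constrained beam search.

```python
import math
from typing import Optional, Sequence


def log_sum_exp(values: Sequence[float]) -> float:
    """Numerically stable log(sum(exp(v) for v in values))."""
    largest = max(values)
    return largest + math.log(sum(math.exp(v - largest) for v in values))


def mml_loss(sequence_log_probs: Sequence[float],
             training_beam_size: Optional[int] = None) -> float:
    """Negative log of the total probability of the correct action sequences.

    If `training_beam_size` is given, only that many top-scoring sequences
    (according to the model) contribute to the marginal.
    """
    if not sequence_log_probs:
        raise ValueError("need at least one action sequence")
    scores = sorted(sequence_log_probs, reverse=True)
    if training_beam_size is not None:
        scores = scores[:training_beam_size]
    return -log_sum_exp(scores)
```

With two sequences of probability 0.25 each, the marginal is 0.5 when all sequences are used and 0.25 when the beam keeps only one, so a smaller beam yields a larger loss for the same model scores.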
-
forward
(self, question: Dict[str, torch.LongTensor], table: Dict[str, torch.LongTensor], world: List[allennlp.semparse.domain_languages.wikitables_language.WikiTablesLanguage], actions: List[List[allennlp.data.fields.production_rule_field.ProductionRule]], target_values: List[List[str]] = None, target_action_sequences: torch.LongTensor = None, metadata: List[Dict[str, Any]] = None) → Dict[str, torch.Tensor] In this method we encode the table entities, link them to words in the question, then encode the question. Then we set up the initial state for the decoder, and pass that state off to either a DecoderTrainer, if we’re training, or a BeamSearch for inference, if we’re not.
- Parameters
- question
Dict[str, torch.LongTensor]
The output of
TextField.as_array()
applied on the question TextField
. This will be passed through a TextFieldEmbedder
and then through an encoder.
- table
Dict[str, torch.LongTensor]
The output of
KnowledgeGraphField.as_array()
applied on the table KnowledgeGraphField
. This output is similar to a TextField
output, where each entity in the table is treated as a “token”, and we will use a TextFieldEmbedder
to get embeddings for each entity.
- world
List[WikiTablesLanguage]
We use a
MetadataField
to get the WikiTablesLanguage
object for each input instance. Because of how MetadataField
works, this gets passed to us as a List[WikiTablesLanguage]
.
- actions
List[List[ProductionRule]]
A list of all possible actions for each
world
in the batch, indexed into a ProductionRule
using a ProductionRuleField
. We will embed all of these and use the embeddings to determine which action to take at each timestep in the decoder.
- target_values
List[List[str]]
, optional (default = None) For each instance, a list of target values taken from the example lisp string. We pass this list to the evaluator along with logical forms to compute denotation accuracy.
- target_action_sequences
torch.LongTensor
, optional (default = None) A list of possibly valid action sequences, where each action is an index into the list of possible actions. This tensor has shape
(batch_size, num_action_sequences, sequence_length)
.
- metadata
List[Dict[str, Any]]
, optional (default = None) Metadata containing the original tokenized question within a ‘question_tokens’ field.
-
class
allennlp.models.semantic_parsing.wikitables.wikitables_erm_semantic_parser.
WikiTablesErmSemanticParser
(vocab: allennlp.data.vocabulary.Vocabulary, question_embedder: allennlp.modules.text_field_embedders.text_field_embedder.TextFieldEmbedder, action_embedding_dim: int, encoder: allennlp.modules.seq2seq_encoders.seq2seq_encoder.Seq2SeqEncoder, entity_encoder: allennlp.modules.seq2vec_encoders.seq2vec_encoder.Seq2VecEncoder, attention: allennlp.modules.attention.attention.Attention, decoder_beam_size: int, decoder_num_finished_states: int, max_decoding_steps: int, mixture_feedforward: allennlp.modules.feedforward.FeedForward = None, add_action_bias: bool = True, normalize_beam_score_by_length: bool = False, checklist_cost_weight: float = 0.6, use_neighbor_similarity_for_linking: bool = False, dropout: float = 0.0, num_linking_features: int = 10, rule_namespace: str = 'rule_labels', mml_model_file: str = None) Bases:
allennlp.models.semantic_parsing.wikitables.wikitables_semantic_parser.WikiTablesSemanticParser
A
WikiTablesErmSemanticParser
is a WikiTablesSemanticParser
that learns to search for logical forms that yield the correct denotations.
- Parameters
- vocab
Vocabulary
- question_embedder
TextFieldEmbedder
Embedder for questions. Passed to super class.
- action_embedding_dim
int
Dimension to use for action embeddings. Passed to super class.
- encoder
Seq2SeqEncoder
The encoder to use for the input question. Passed to super class.
- entity_encoder
Seq2VecEncoder
The encoder used for averaging the words of an entity. Passed to super class.
- attention
Attention
We compute an attention over the input question at each step of the decoder, using the decoder hidden state as the query. Passed to the transition function.
- decoder_beam_size
int
Beam size to be used by the ExpectedRiskMinimization algorithm.
- decoder_num_finished_states
int
Number of finished states for which costs will be computed by the ExpectedRiskMinimization algorithm.
- max_decoding_steps
int
Maximum number of steps the decoder should take before giving up. Used both during training and evaluation. Passed to super class.
- add_action_bias
bool
, optional (default=True) If
True
, we will learn a bias weight for each action that gets used when predicting that action, in addition to its embedding. Passed to super class.
- normalize_beam_score_by_length
bool
, optional (default=False) Should we normalize the log-probabilities by length before renormalizing the beam? This was shown to work better for NMT by Edunov et al., but that may not be the case for semantic parsing.
- checklist_cost_weight
float
, optional (default=0.6) Mixture weight (0-1) for combining coverage cost and denotation cost. As this increases, we weigh the coverage cost higher, with a value of 1.0 meaning that we do not care about denotation accuracy.
- use_neighbor_similarity_for_linking
bool
, optional (default=False) If
True
, we will compute a max similarity between a question token and the neighbors of an entity as a component of the linking scores. This is meant to capture the same kind of information as the related_column
feature. Passed to super class.
- dropout
float
, optional (default=0) If greater than 0, we will apply dropout with this probability after all encoders (pytorch LSTMs do not apply dropout to their last layer). Passed to super class.
- num_linking_features
int
, optional (default=10) We need to construct a parameter vector for the linking features, so we need to know how many there are. The default of 10 here matches the default in the
KnowledgeGraphField
, which is to use all ten defined features. If this is 0, another term will be added to the linking score. This term contains the maximum similarity value from the entity’s neighbors and the question. Passed to super class.
- rule_namespace
str
, optional (default=rule_labels) The vocabulary namespace to use for production rules. The default corresponds to the default used in the dataset reader, so you likely don’t need to modify this. Passed to super class.
- mml_model_file
str
, optional (default=None) If you want to initialize this model using weights from another model trained using MML, pass the path to the
model.tar.gz
file of that model here.
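How checklist_cost_weight and normalize_beam_score_by_length interact with the expected risk objective can be sketched as below. This is an illustration under assumptions rather than the model's implementation: the cost is taken to be a convex combination of a coverage (checklist) cost and a denotation cost, and the expected risk is the probability-weighted cost over the finished states on the beam. All function names are hypothetical.

```python
import math
from typing import Optional, Sequence


def state_cost(checklist_cost: float,
               denotation_cost: float,
               checklist_cost_weight: float = 0.6) -> float:
    """Mix coverage cost and denotation cost with a weight in [0, 1]."""
    if not 0.0 <= checklist_cost_weight <= 1.0:
        raise ValueError("checklist_cost_weight must be between 0 and 1")
    return (checklist_cost_weight * checklist_cost
            + (1.0 - checklist_cost_weight) * denotation_cost)


def expected_risk(log_probs: Sequence[float],
                  costs: Sequence[float],
                  lengths: Optional[Sequence[int]] = None) -> float:
    """Expected cost over finished states, renormalized over the beam.

    Passing `lengths` divides each log-probability by its sequence length
    first, which is the assumed meaning of length normalization here.
    """
    if not log_probs or len(log_probs) != len(costs):
        raise ValueError("need one cost per finished state")
    scores = list(log_probs)
    if lengths is not None:
        if len(lengths) != len(scores):
            raise ValueError("need one length per finished state")
        scores = [score / length for score, length in zip(scores, lengths)]
    # Softmax over the beam, shifted by the max for numerical stability.
    largest = max(scores)
    weights = [math.exp(score - largest) for score in scores]
    total = sum(weights)
    return sum(weight / total * cost for weight, cost in zip(weights, costs))
```

With a weight of 1.0 the denotation cost drops out entirely, which matches the parameter description: the parser is then rewarded only for covering the agenda.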
-
forward
(self, question: Dict[str, torch.LongTensor], table: Dict[str, torch.LongTensor], world: List[allennlp.semparse.domain_languages.wikitables_language.WikiTablesLanguage], actions: List[List[allennlp.data.fields.production_rule_field.ProductionRule]], agenda: torch.LongTensor, target_values: List[List[str]] = None, metadata: List[Dict[str, Any]] = None) → Dict[str, torch.Tensor]
- Parameters
- question
Dict[str, torch.LongTensor]
The output of
TextField.as_array()
applied on the question TextField
. This will be passed through a TextFieldEmbedder
and then through an encoder.
- table
Dict[str, torch.LongTensor]
The output of
KnowledgeGraphField.as_array()
applied on the table KnowledgeGraphField
. This output is similar to a TextField
output, where each entity in the table is treated as a “token”, and we will use a TextFieldEmbedder
to get embeddings for each entity.
- world
List[WikiTablesLanguage]
We use a
MetadataField
to get the WikiTablesLanguage
object for each input instance. Because of how MetadataField
works, this gets passed to us as a List[WikiTablesLanguage]
.
- actions
List[List[ProductionRule]]
A list of all possible actions for each
world
in the batch, indexed into a ProductionRule
using a ProductionRuleField
. We will embed all of these and use the embeddings to determine which action to take at each timestep in the decoder.
- agenda
torch.LongTensor
Agenda vectors that the checklist vectors will be compared against to compute the checklist cost.
- target_values
List[List[str]]
, optional (default = None) For each instance, a list of target values taken from the example lisp string. We pass this list to the evaluator along with logical forms to compute denotation accuracy.
- metadata
List[Dict[str, Any]]
, optional (default = None) Metadata containing the original tokenized question within a ‘question_tokens’ field.