allennlp.semparse.worlds¶
-
class
allennlp.semparse.worlds.world.
World
(constant_type_prefixes: Dict[str, nltk.sem.logic.BasicType] = None, global_type_signatures: Dict[str, nltk.sem.logic.Type] = None, global_name_mapping: Dict[str, str] = None, num_nested_lambdas: int = 0)[source]¶ Bases:
object
Base class for defining a world in a new domain. This class defines a method to translate a logical form as per a naming convention that works with NLTK’s
LogicParser
. Subclasses can decide on the convention by overriding the
_map_name
method, which does token-level mapping. This class also defines methods for transforming logical form strings into parsed
Expressions
, and
Expressions
into action sequences.
- Parameters
- constant_type_prefixes
Dict[str, BasicType]
(optional) If you have an unbounded number of constants in your domain, you are required to add prefixes to their names to denote their types. This is the mapping from prefixes to types.
- global_type_signatures
Dict[str, Type]
(optional) A mapping from translated names to their types.
- global_name_mapping
Dict[str, str]
(optional) A name mapping from the original names in the domain to the translated names.
- num_nested_lambdas
int
(optional) Does the language used in this
World
permit lambda expressions? And if so, how many nested lambdas do we need to worry about? This is important when considering the space of all possible actions, which we need to enumerate a priori for the parser.
-
get_action_sequence
(self, expression: nltk.sem.logic.Expression) → List[str][source]¶ Returns the sequence of actions (as strings) that resulted in the given expression.
-
get_basic_types
(self) → Set[nltk.sem.logic.Type][source]¶ Returns the set of basic types (types of entities) in the world.
-
get_logical_form
(self, action_sequence: List[str], add_var_function: bool = True) → str[source]¶ Takes an action sequence and constructs a logical form from it. This is useful if you want to get a logical form from a decoded sequence of actions generated by a transition-based semantic parser.
- Parameters
- action_sequence
List[str]
The sequence of actions as strings (e.g.,
['{START_SYMBOL} -> t', 't -> <e,t>', ...]
).
- add_var_function
bool
(optional)
var
is a special function that some languages use within lambda functions to indicate the use of a variable (e.g.,
(lambda x (fb:row.row.year (var x)))
). Due to the way constrained decoding is currently implemented, it is easier for the decoder to not produce these functions. In that case, setting this flag adds the function in the logical form even though it is not present in the action sequence.
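The action strings described above follow a "left-hand side -> right-hand side" pattern. A minimal sketch of splitting them into pairs is shown below; it uses no AllenNLP code, and both the helper name `split_action` and the value given to `START_SYMBOL` are assumptions made for illustration.

```python
from typing import List, Tuple

# Assumed placeholder for the start symbol; the real constant comes from the library.
START_SYMBOL = "@start@"


def split_action(action: str) -> Tuple[str, str]:
    """Split an action string of the form 'lhs -> rhs' into its two sides."""
    left_side, right_side = action.split(" -> ", 1)
    return left_side, right_side


actions: List[str] = [f"{START_SYMBOL} -> t", "t -> <e,t>"]
pairs = [split_action(action) for action in actions]
```

Splitting on the first " -> " only keeps any later arrows inside the right-hand side intact.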
-
get_multi_match_mapping
(self) → Dict[nltk.sem.logic.Type, List[nltk.sem.logic.Type]][source]¶ Returns a mapping from each MultiMatchNamedBasicType to all the NamedBasicTypes that it matches.
-
get_paths_to_root
(self, action: str, max_path_length: int = 20, beam_size: int = 30, max_num_paths: int = 10) → List[List[str]][source]¶ For a given action, returns at most
max_num_paths
paths to the root (production with
START_SYMBOL
) that are not longer than
max_path_length
.
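The idea of walking from an action up to a start production can be sketched on a toy grammar. This is not the library's implementation: it omits beam pruning, the function name `paths_to_root`, the toy grammar, and the `START_SYMBOL` value are all assumptions, and right-hand sides are assumed to be either a single symbol or a bracketed list separated by ", ".

```python
from typing import List

START_SYMBOL = "@start@"  # assumed placeholder value


def _lhs(action: str) -> str:
    return action.split(" -> ", 1)[0]


def _rhs_symbols(action: str) -> List[str]:
    rhs = action.split(" -> ", 1)[1]
    return rhs.strip("[]").split(", ")


def paths_to_root(action: str,
                  all_actions: List[str],
                  max_path_length: int = 20,
                  max_num_paths: int = 10) -> List[List[str]]:
    """Breadth-first search from `action` towards a START_SYMBOL production."""
    finished: List[List[str]] = []
    frontier = [[action]]
    while frontier and len(finished) < max_num_paths:
        next_frontier = []
        for path in frontier:
            head = _lhs(path[-1])
            if head == START_SYMBOL:
                finished.append(path)
                continue
            if len(path) >= max_path_length:
                continue
            # A parent is any action whose right-hand side uses `head`.
            for candidate in all_actions:
                if head in _rhs_symbols(candidate):
                    next_frontier.append(path + [candidate])
        frontier = next_frontier
    return finished[:max_num_paths]


toy_grammar = ["@start@ -> t", "t -> [f, e]", "e -> alice", "f -> likes"]
paths = paths_to_root("e -> alice", toy_grammar)
```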
-
get_valid_starting_types
(self) → Set[nltk.sem.logic.Type][source]¶ Returns the set of all types t, such that actions
{START_SYMBOL} -> t
are valid. In other words, these are all the possible types of complete logical forms in this world.
-
is_terminal
(self, symbol: str) → bool[source]¶ This function will be called on nodes of a logical form tree, which are either non-terminal symbols that can be expanded or terminal symbols that must be leaf nodes. Returns
True
if the given symbol is a terminal symbol.
-
parse_logical_form
(self, logical_form: str, remove_var_function: bool = True) → nltk.sem.logic.Expression[source]¶ Takes a logical form as a string, maps its tokens using the mapping and returns a parsed expression.
- Parameters
- logical_form
str
Logical form to parse
- remove_var_function
bool
(optional)
var
is a special function that some languages use within lambda functions to indicate the usage of a variable. If your language uses it, and you do not want to include it in the parsed expression, set this flag. You may want to do this if you are generating an action sequence from this parsed expression, because it is easier to let the decoder not produce this function due to the way constrained decoding is currently implemented.
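To illustrate what dropping the var function from a logical form string means, here is a small regular-expression sketch. It only shows the textual effect; the helper name `strip_var_function` is hypothetical and the library may implement this step differently.

```python
import re

# Matches "(var x)" where x is a single token without spaces or parentheses.
VAR_CALL = re.compile(r"\(var\s+([^\s()]+)\)")


def strip_var_function(logical_form: str) -> str:
    """Replace every '(var x)' with the bare variable name 'x'."""
    return VAR_CALL.sub(r"\1", logical_form)


stripped = strip_var_function("(lambda x (fb:row.row.year (var x)))")
```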
-
class
allennlp.semparse.worlds.atis_world.
AtisWorld
(utterances: List[str], tokenizer: allennlp.data.tokenizers.tokenizer.Tokenizer = None)[source]¶ Bases:
object
World representation for the ATIS SQL domain. This class has a
SqlTableContext
which holds the base grammar. It then augments this grammar by constraining each column to the values that are allowed in it.
- Parameters
- utterances
List[str]
A list of utterances in the interaction. The last element in this list is the current utterance that we are interested in.
- tokenizer
Tokenizer
, optional (default = WordTokenizer()) We use this tokenizer to tokenize the utterances.
-
add_dates_to_number_linking_scores
(self, number_linking_scores: Dict[str, Tuple[str, str, List[int]]], current_tokenized_utterance: List[allennlp.data.tokenizers.token.Token]) → None[source]¶
-
add_to_number_linking_scores
(self, all_numbers: Set[str], number_linking_scores: Dict[str, Tuple[str, str, List[int]]], get_number_linking_dict: Callable[[str, List[allennlp.data.tokenizers.token.Token]], Dict[str, List[int]]], current_tokenized_utterance: List[allennlp.data.tokenizers.token.Token], nonterminal: str) → None[source]¶ This is a helper method for adding different types of numbers (e.g., starting time ranges) as entities. We first go through all utterances in the interaction, find the numbers of a certain type, and add them to the set
all_numbers
, which is initialized with default values. We want to add all numbers that occur in the interaction, not just those in the current turn, because the query could contain numbers that were triggered before the current turn. For each entity, we then check whether it is triggered by tokens in the current utterance and construct the linking score.
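The linking step described above can be sketched with plain strings standing in for tokens. This is an illustration under assumptions, not the library's code: the function name `link_numbers`, the key format of the returned dictionary, and the example trigger data are all hypothetical. Each entity gets a 0/1 list with one slot per token of the current utterance.

```python
from typing import Dict, List, Set, Tuple


def link_numbers(all_numbers: Set[str],
                 triggers: Dict[str, List[int]],
                 current_tokens: List[str],
                 nonterminal: str) -> Dict[str, Tuple[str, str, List[int]]]:
    """Build (nonterminal, number, linking) entries for every number entity."""
    scores: Dict[str, Tuple[str, str, List[int]]] = {}
    for number in sorted(all_numbers):
        linking = [0] * len(current_tokens)
        # Mark the tokens of the current utterance that trigger this number.
        for token_index in triggers.get(number, []):
            if token_index < len(linking):
                linking[token_index] = 1
        key = f'{nonterminal} -> ["{number}"]'  # hypothetical key format
        scores[key] = (nonterminal, number, linking)
    return scores


tokens = ["flights", "after", "8", "am"]
scores = link_numbers({"800", "0"}, {"800": [2]}, tokens, "time_range_start")
```

Numbers from earlier turns that no current token triggers still get an entry, with an all-zero linking list.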
-
all_possible_actions
(self) → List[str][source]¶ Return a sorted list of strings representing all possible actions of the form: nonterminal -> [right_hand_side]
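As a rough sketch of the output format, the snippet below turns a toy grammar dictionary into sorted action strings. The grammar representation and the function name `all_actions` are assumptions for illustration; the actual grammar object used by the class is not shown here.

```python
from typing import Dict, List


def all_actions(grammar: Dict[str, List[List[str]]]) -> List[str]:
    """Format every expansion as 'nonterminal -> [right_hand_side]' and sort."""
    actions = set()
    for nonterminal, expansions in grammar.items():
        for right_hand_side in expansions:
            actions.add(f"{nonterminal} -> [{', '.join(right_hand_side)}]")
    return sorted(actions)


toy_grammar = {"b": [["x"]], "a": [["b", "c"], ["x"]]}
actions = all_actions(toy_grammar)
```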
-
database_file
= 'https://allennlp.s3.amazonaws.com/datasets/atis/atis.db'¶
-
sql_table_context
= None¶
-
allennlp.semparse.worlds.atis_world.
get_strings_from_utterance
(tokenized_utterance: List[allennlp.data.tokenizers.token.Token]) → Dict[str, List[int]][source]¶ Based on the current utterance, return a dictionary whose keys are strings in the database and whose values are lists of the token indices that they are linked to.
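The shape of the returned dictionary can be illustrated with a small sketch. The trigger table, the function name `strings_from_tokens`, and the use of plain strings instead of Token objects are all assumptions; the real function relies on the library's own trigger data.

```python
from collections import defaultdict
from typing import Dict, List

# Hypothetical trigger table: lowercased token -> database strings it links to.
TRIGGERS: Dict[str, List[str]] = {
    "boston": ["BOSTON"],
    "denver": ["DENVER"],
}


def strings_from_tokens(tokens: List[str]) -> Dict[str, List[int]]:
    """Map each triggered database string to the indices of its tokens."""
    linked: Dict[str, List[int]] = defaultdict(list)
    for index, token in enumerate(tokens):
        for database_string in TRIGGERS.get(token.lower(), []):
            linked[database_string].append(index)
    return dict(linked)


linked = strings_from_tokens(["flights", "from", "Boston", "to", "Denver"])
```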
-
class
allennlp.semparse.worlds.text2sql_world.
Text2SqlWorld
(schema_path: str, cursor: sqlite3.Cursor = None, use_prelinked_entities: bool = True, variable_free: bool = True, use_untyped_entities: bool = False)[source]¶ Bases:
object
World representation for any of the Text2Sql datasets.
- Parameters
- schema_path
str
A path to a schema file which we read into a dictionary representing the SQL tables in the dataset. The keys are the names of the tables, which map to lists of the table’s column names.
- cursor
Cursor
, optional (default = None) An optional cursor for a database, which is used to add database values to the grammar.
- use_prelinked_entities
bool
, optional (default = True) Whether or not to use the pre-linked entities from the text2sql data. We take this parameter here because it affects whether we need to add table values to the grammar.
- variable_free
bool
, optional (default = True) Denotes whether the data being parsed by the grammar is variable free. If it is, the grammar is modified to be less expressive by removing elements which are not necessary if the data is variable free.
- use_untyped_entities
bool
, optional (default = False) Whether or not to try to infer the types of prelinked variables. If not, they are added as untyped values to the grammar instead.
This module defines QuarelWorld, with a simple domain theory for reasoning about qualitative relations.
-
class
allennlp.semparse.worlds.quarel_world.
QuarelWorld
(table_graph: allennlp.semparse.contexts.knowledge_graph.KnowledgeGraph, syntax: str, qr_coeff_sets: List[Dict[str, int]] = None)[source]¶ Bases:
allennlp.semparse.worlds.world.World
Class defining the QuaRel domain theory world.
-
execute
(self, lf_raw: str) → int[source]¶ Very basic model for executing friction logical forms. For now, it returns the answer index (or -1 if no answer can be concluded).
-
get_basic_types
(self) → Set[nltk.sem.logic.Type][source]¶ Returns the set of basic types (types of entities) in the world.
-
get_valid_starting_types
(self) → Set[nltk.sem.logic.Type][source]¶ Returns the set of all types t, such that actions
{START_SYMBOL} -> t
are valid. In other words, these are all the possible types of complete logical forms in this world.
-
is_table_entity
(self, entity_name: str) → bool[source]¶ Returns
True
if the given entity is one of the entities in the table.
-
qr_coeff_sets_default
= [{'friction': 1, 'speed': -1, 'smoothness': -1, 'distance': -1, 'heat': 1}, {'speed': 1, 'time': -1}, {'speed': 1, 'distance': 1}, {'time': 1, 'distance': 1}, {'weight': 1, 'acceleration': -1}, {'strength': 1, 'distance': 1}, {'strength': 1, 'thickness': 1}, {'mass': 1, 'gravity': 1}, {'flexibility': 1, 'breakability': -1}, {'distance': 1, 'loudness': -1, 'brightness': -1, 'apparentSize': -1}, {'exerciseIntensity': 1, 'amountSweat': 1}]¶
-
qr_size
= {'high': 1, 'higher': 1, 'low': -1, 'lower': -1}¶
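One way to read the coefficient sets and size mapping above is as signs that multiply together. The sketch below shows that reading on two of the default sets. It is an assumption about how the values can be combined, not a description of how the execute method works, and the function name `implied_direction` is hypothetical.

```python
from typing import Dict, List, Optional

# Two of the default coefficient sets, copied from the listing above.
QR_COEFF_SETS: List[Dict[str, int]] = [
    {'friction': 1, 'speed': -1, 'smoothness': -1, 'distance': -1, 'heat': 1},
    {'speed': 1, 'time': -1},
]
QR_SIZE: Dict[str, int] = {'high': 1, 'higher': 1, 'low': -1, 'lower': -1}


def implied_direction(source: str, size: str, target: str) -> Optional[int]:
    """Return +1 or -1 for the implied change in `target`, or None if unrelated."""
    for coeff_set in QR_COEFF_SETS:
        if source in coeff_set and target in coeff_set:
            # Same-sign properties move together; opposite signs move apart.
            return QR_SIZE[size] * coeff_set[source] * coeff_set[target]
    return None


speed_change = implied_direction('friction', 'higher', 'speed')
```

Under this reading, higher friction implies lower speed, because the two properties carry opposite signs in the first set.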
-