allennlp.semparse.worlds

class allennlp.semparse.worlds.world.World(constant_type_prefixes: Dict[str, nltk.sem.logic.BasicType] = None, global_type_signatures: Dict[str, nltk.sem.logic.Type] = None, global_name_mapping: Dict[str, str] = None, num_nested_lambdas: int = 0)[source]

Bases: object

Base class for defining a world in a new domain. This class defines a method to translate a logical form as per a naming convention that works with NLTK's LogicParser. Subclasses can decide on the convention by overriding the _map_name method, which does token-level mapping. This class also defines methods for transforming logical form strings into parsed Expressions, and Expressions into action sequences.

Parameters
constant_type_prefixes : ``Dict[str, BasicType]``, optional

If you have an unbounded number of constants in your domain, you are required to add prefixes to their names to denote their types. This is the mapping from prefixes to types.

global_type_signatures : ``Dict[str, Type]``, optional

A mapping from translated names to their types.

global_name_mapping : ``Dict[str, str]``, optional

A name mapping from the original names in the domain to the translated names.

num_nested_lambdas : ``int``, optional

Does the language used in this World permit lambda expressions? And if so, how many nested lambdas do we need to worry about? This is important when considering the space of all possible actions, which we need to enumerate a priori for the parser.
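
For illustration, a minimal subclass might look like the following sketch. The class name and type choices are made up, the _map_name signature is assumed to take the original token and return its translated name, and we assume the base constructor stores global_name_mapping on self:

    from typing import Set

    from nltk.sem.logic import Type, ENTITY_TYPE, TRUTH_TYPE

    from allennlp.semparse.worlds.world import World

    class ToyWorld(World):
        # A toy world with a single basic entity type whose complete
        # logical forms are truth-valued.
        def get_basic_types(self) -> Set[Type]:
            return {ENTITY_TYPE}

        def get_valid_starting_types(self) -> Set[Type]:
            return {TRUTH_TYPE}

        def _map_name(self, name: str, keep_mapping: bool = False) -> str:
            # Translate a domain token into a name that NLTK's LogicParser
            # accepts; here we simply look it up in the global name mapping.
            return self.global_name_mapping.get(name, name)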

all_possible_actions(self) → List[str][source]
get_action_sequence(self, expression: nltk.sem.logic.Expression) → List[str][source]

Returns the sequence of actions (as strings) that resulted in the given expression.

get_basic_types(self) → Set[nltk.sem.logic.Type][source]

Returns the set of basic types (types of entities) in the world.

get_logical_form(self, action_sequence: List[str], add_var_function: bool = True) → str[source]

Takes an action sequence and constructs a logical form from it. This is useful if you want to get a logical form from a decoded sequence of actions generated by a transition-based semantic parser.

Parameters
action_sequence : ``List[str]``

The sequence of actions as strings (e.g., ['{START_SYMBOL} -> t', 't -> <e,t>', ...]).

add_var_function : ``bool``, optional

var is a special function that some languages use within lambda functions to indicate the use of a variable (e.g., (lambda x (fb:row.row.year (var x)))). Due to the way constrained decoding is currently implemented, it is easier for the decoder not to produce these functions. In that case, setting this flag adds the function to the logical form even though it is not present in the action sequence.
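
A usage sketch, assuming world is an instance of a concrete World subclass and that the productions below exist in its grammar (they are made up here):

    actions = ['{START_SYMBOL} -> t',
               't -> [<e,t>, e]',
               '<e,t> -> fb:row.row.year',
               'e -> fb:cell.2010']
    logical_form = world.get_logical_form(actions, add_var_function=False)
    # Yields a string such as '(fb:row.row.year fb:cell.2010)'.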

get_multi_match_mapping(self) → Dict[nltk.sem.logic.Type, List[nltk.sem.logic.Type]][source]

Returns a mapping from each MultiMatchNamedBasicType to all the NamedBasicTypes that it matches.

get_name_mapping(self) → Dict[str, str][source]
get_paths_to_root(self, action: str, max_path_length: int = 20, beam_size: int = 30, max_num_paths: int = 10) → List[List[str]][source]

For a given action, returns at most max_num_paths paths to the root (production with START_SYMBOL) that are not longer than max_path_length.
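
For instance, with a hypothetical production:

    # Find up to 10 derivation paths from this production back to the
    # start symbol, each at most 20 productions long.
    paths = world.get_paths_to_root('e -> fb:cell.2010',
                                    max_path_length=20,
                                    beam_size=30,
                                    max_num_paths=10)
    # Each element of paths is itself a list of production strings.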

get_type_signatures(self) → Dict[str, str][source]
get_valid_actions(self) → Dict[str, List[str]][source]
get_valid_starting_types(self) → Set[nltk.sem.logic.Type][source]

Returns the set of all types t, such that actions {START_SYMBOL} -> t are valid. In other words, these are all the possible types of complete logical forms in this world.

is_terminal(self, symbol: str) → bool[source]

This function will be called on nodes of a logical form tree, which are either non-terminal symbols that can be expanded or terminal symbols that must be leaf nodes. Returns True if the given symbol is a terminal symbol.

parse_logical_form(self, logical_form: str, remove_var_function: bool = True) → nltk.sem.logic.Expression[source]

Takes a logical form as a string, maps its tokens using the mapping and returns a parsed expression.

Parameters
logical_form : ``str``

Logical form to parse

remove_var_function : ``bool``, optional

var is a special function that some languages use within lambda functions to indicate the usage of a variable. If your language uses it and you do not want to include it in the parsed expression, set this flag. You may want to do this if you are generating an action sequence from the parsed expression, because it is easier to let the decoder not produce this function, due to the way constrained decoding is currently implemented.
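
Together with get_action_sequence and get_logical_form above, this supports a round trip between logical form strings, Expressions, and action sequences. A sketch, using a made-up logical form:

    expression = world.parse_logical_form('(fb:row.row.year fb:cell.2010)')
    actions = world.get_action_sequence(expression)
    # actions is a list like ['{START_SYMBOL} -> t', ...]; feeding it back
    # through get_logical_form should recover an equivalent logical form.
    reconstructed = world.get_logical_form(actions)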

class allennlp.semparse.worlds.atis_world.AtisWorld(utterances: List[str], tokenizer: allennlp.data.tokenizers.tokenizer.Tokenizer = None)[source]

Bases: object

World representation for the Atis SQL domain. This class has a SqlTableContext which holds the base grammar; it then augments this grammar by constraining each column to the values that are allowed in it.

Parameters
utterances: ``List[str]``

A list of utterances in the interaction; the last element in this list is the current utterance that we are interested in.

tokenizer: ``Tokenizer``, optional (default=``WordTokenizer()``)

We use this tokenizer to tokenize the utterances.
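
A construction sketch with made-up utterances:

    from allennlp.semparse.worlds.atis_world import AtisWorld

    world = AtisWorld(utterances=['show me flights from denver to boston',
                                  'which ones arrive before noon'])
    valid_actions = world.get_valid_actions()    # Dict[str, List[str]]
    all_actions = world.all_possible_actions()   # sorted 'lhs -> [rhs]' strings

Note that building the grammar may consult the database pointed to by AtisWorld.database_file (see below).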

add_dates_to_number_linking_scores(self, number_linking_scores: Dict[str, Tuple[str, str, List[int]]], current_tokenized_utterance: List[allennlp.data.tokenizers.token.Token]) → None[source]
add_to_number_linking_scores(self, all_numbers: Set[str], number_linking_scores: Dict[str, Tuple[str, str, List[int]]], get_number_linking_dict: Callable[[str, List[allennlp.data.tokenizers.token.Token]], Dict[str, List[int]]], current_tokenized_utterance: List[allennlp.data.tokenizers.token.Token], nonterminal: str) → None[source]

This is a helper method for adding different types of numbers (e.g., starting time ranges) as entities. We first go through all utterances in the interaction, find the numbers of a certain type, and add them to the set all_numbers, which is initialized with default values. We want to add all numbers that occur in the interaction, not just those in the current turn, because the query could contain numbers that were triggered before the current turn. For each entity, we then check whether it is triggered by tokens in the current utterance and construct the linking score.
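
A sketch of the shape implied by the type annotation above; the key, the tuple ordering, and the values here are illustrative assumptions:

    # One entry per number entity: a nonterminal and the entity string,
    # plus a 0/1 linking score over the tokens of the current utterance.
    number_linking_scores = {
        '0800': ('time_range_start', '0800', [0, 0, 0, 1, 0]),
    }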

all_possible_actions(self) → List[str][source]

Return a sorted list of strings representing all possible actions of the form: nonterminal -> [right_hand_side]

database_file = 'https://allennlp.s3.amazonaws.com/datasets/atis/atis.db'
get_action_sequence(self, query: str) → List[str][source]
get_valid_actions(self) → Dict[str, List[str]][source]
sql_table_context = None
allennlp.semparse.worlds.atis_world.get_strings_from_utterance(tokenized_utterance: List[allennlp.data.tokenizers.token.Token]) → Dict[str, List[int]][source]

Based on the current utterance, return a dictionary mapping strings in the database to lists of the indices of the tokens that they are linked to.
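
A usage sketch with a made-up utterance; which keys come back depends on the strings in the ATIS database:

    from allennlp.data.tokenizers import WordTokenizer
    from allennlp.semparse.worlds.atis_world import get_strings_from_utterance

    tokens = WordTokenizer().tokenize('show me flights to boston')
    string_linking = get_strings_from_utterance(tokens)
    # e.g. something like {'BOSTON': [4], ...}: database strings mapped to
    # the indices of the tokens that triggered them.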

class allennlp.semparse.worlds.text2sql_world.Text2SqlWorld(schema_path: str, cursor: sqlite3.Cursor = None, use_prelinked_entities: bool = True, variable_free: bool = True, use_untyped_entities: bool = False)[source]

Bases: object

World representation for any of the Text2Sql datasets.

Parameters
schema_path: ``str``

A path to a schema file which we read into a dictionary representing the SQL tables in the dataset. The keys are the names of the tables, which map to lists of the table's column names.

cursor : ``Cursor``, optional (default = None)

An optional cursor for a database, which is used to add database values to the grammar.

use_prelinked_entities : ``bool``, optional (default = True)

Whether or not to use the pre-linked entities from the text2sql data. We take this parameter here because it affects whether we need to add table values to the grammar.

variable_free : ``bool``, optional (default = True)

Denotes whether the data being parsed by the grammar is variable free. If it is, the grammar is modified to be less expressive by removing elements which are not necessary if the data is variable free.

use_untyped_entities : ``bool``, optional (default = False)

Whether or not to try to infer the types of prelinked variables. If not, they are added as untyped values to the grammar instead.

get_action_sequence_and_all_actions(self, query: List[str] = None, prelinked_entities: Dict[str, Dict[str, str]] = None) → Tuple[List[str], List[str]][source]
is_global_rule(self, production_rule: str) → bool[source]
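
A usage sketch for this class; the schema path, query tokens, and prelinked-entity structure below are all hypothetical:

    from allennlp.semparse.worlds.text2sql_world import Text2SqlWorld

    world = Text2SqlWorld(schema_path='data/restaurants-schema.csv')
    query = ['SELECT', 'RESTAURANT.NAME', 'FROM', 'RESTAURANT', ';']
    prelinked = {'city_name0': {'text': 'denver', 'type': 'city_name'}}
    actions, all_actions = world.get_action_sequence_and_all_actions(query, prelinked)
    # is_global_rule distinguishes grammar-wide productions from ones tied
    # to instance-specific values:
    global_rules = [rule for rule in actions if world.is_global_rule(rule)]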

This module defines QuarelWorld, with a simple domain theory for reasoning about qualitative relations.

class allennlp.semparse.worlds.quarel_world.QuarelWorld(table_graph: allennlp.semparse.contexts.knowledge_graph.KnowledgeGraph, syntax: str, qr_coeff_sets: List[Dict[str, int]] = None)[source]

Bases: allennlp.semparse.worlds.world.World

Class defining the QuaRel domain theory world.

execute(self, lf_raw: str) → int[source]

A very basic model for executing friction logical forms. For now, this returns the answer index (or -1 if no answer can be concluded).
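
A sketch, assuming world is a QuarelWorld built for the friction subset; the logical form is made up:

    # Returns the index of the supported answer option, or -1 if nothing
    # follows from the domain theory.
    answer_index = world.execute(
        '(infer (speed higher world1) (friction higher world1) (friction higher world2))')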

get_basic_types(self) → Set[nltk.sem.logic.Type][source]

Returns the set of basic types (types of entities) in the world.

get_valid_starting_types(self) → Set[nltk.sem.logic.Type][source]

Returns the set of all types t, such that actions {START_SYMBOL} -> t are valid. In other words, these are all the possible types of complete logical forms in this world.

is_table_entity(self, entity_name: str) → bool[source]

Returns True if the given entity is one of the entities in the table.

qr_coeff_sets_default = [{'friction': 1, 'speed': -1, 'smoothness': -1, 'distance': -1, 'heat': 1}, {'speed': 1, 'time': -1}, {'speed': 1, 'distance': 1}, {'time': 1, 'distance': 1}, {'weight': 1, 'acceleration': -1}, {'strength': 1, 'distance': 1}, {'strength': 1, 'thickness': 1}, {'mass': 1, 'gravity': 1}, {'flexibility': 1, 'breakability': -1}, {'distance': 1, 'loudness': -1, 'brightness': -1, 'apparentSize': -1}, {'exerciseIntensity': 1, 'amountSweat': 1}]
qr_size = {'high': 1, 'higher': 1, 'low': -1, 'lower': -1}
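
Each coefficient set above groups mutually related attributes: attributes whose coefficients share a sign move together, and attributes with opposite signs move inversely, with qr_size mapping direction words to signs. The following illustration of the sign arithmetic is a sketch, not the class's actual implementation:

    # Within one coefficient set, two observed changes are consistent when
    # the products coefficient * direction_sign agree.
    qr_set = {'friction': 1, 'speed': -1, 'heat': 1}
    qr_size = {'high': 1, 'higher': 1, 'low': -1, 'lower': -1}

    def consistent(attr_a: str, dir_a: str, attr_b: str, dir_b: str) -> bool:
        return qr_set[attr_a] * qr_size[dir_a] == qr_set[attr_b] * qr_size[dir_b]

    # Higher friction implies lower speed (coefficients 1 and -1)...
    assert consistent('friction', 'higher', 'speed', 'lower')
    # ...and more heat (coefficients 1 and 1).
    assert consistent('friction', 'higher', 'heat', 'higher')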