allennlp.semparse.type_declarations

This module defines classes that are generally useful for defining a type system for a new domain. We inherit the type logic in nltk.sem.logic and add some functionality on top of it. There are two main improvements: 1) we allow defining multiple basic types with their own names (see NamedBasicType), and 2) we allow defining function types that have placeholders in them (see PlaceholderType). We also extend NLTK’s LogicParser to define a DynamicTypeLogicParser that knows how to deal with both improvements.

class allennlp.semparse.type_declarations.type_declaration.BinaryOpType(type_: nltk.sem.logic.BasicType = ?, allowed_substitutions: Set[nltk.sem.logic.BasicType] = None, signature: str = '<#1, <#1, #1>>')[source]

Bases: allennlp.semparse.type_declarations.type_declaration.PlaceholderType

BinaryOpType is a function that takes two arguments of the same type and returns a value of that type. The operators + and -, and the connectives and and or, are examples of this kind of function. The type signature of BinaryOpType is <#1,<#1,#1>>.

Parameters
allowed_substitutions : Set[BasicType], optional (default=None)

If given, this sets restrictions on the types that can be substituted. That is, say you have a binary operation that is only permitted for numbers and dates, you can pass those in here, and we will only consider those types when calling substitute_any_type(). If this is None, all basic types are allowed.

signature : str, optional (default='<#1,<#1,#1>>')

The signature of the operation is what will appear in action sequences that include this type. The default value is suitable for functions that apply to any type. If you have a restricted set of allowed substitutions, you likely want to change the type signature to reflect that.

get_application_type(self, argument_type: nltk.sem.logic.Type) → nltk.sem.logic.Type[source]

This method returns the resulting type when this type is applied as a function to an argument of the given type.
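
As a rough illustration of the <#1,<#1,#1>> signature, the sketch below models types as plain strings and tuples rather than NLTK objects. The helper names `binary_op_application_type` and `show` are hypothetical and not part of the library; the assumption is that binding #1 to the argument type leaves a function still waiting for a second argument of that same type.

```python
from typing import Tuple, Union

# Hypothetical stand-ins: a basic type is a string, a function type is (first, second).
Type = Union[str, Tuple["Type", "Type"]]


def binary_op_application_type(argument_type: Type) -> Type:
    """Type left after feeding one argument to a <#1,<#1,#1>> function.

    Binding #1 to the argument type leaves a function <#1,#1> that still
    expects the second argument of the same type.
    """
    return (argument_type, argument_type)


def show(type_: Type) -> str:
    """Render a type in the <first,second> notation used in this module."""
    if isinstance(type_, tuple):
        return f"<{show(type_[0])},{show(type_[1])}>"
    return type_


after_first_argument = binary_op_application_type("e")
print(show(after_first_argument))  # <e,e>
```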

resolve(self, other: nltk.sem.logic.Type) → Union[nltk.sem.logic.Type, NoneType][source]

See PlaceholderType.resolve

substitute_any_type(self, basic_types: Set[nltk.sem.logic.BasicType]) → List[nltk.sem.logic.Type][source]

Placeholders mess with substitutions, so even though this method is implemented in the superclass, we override it here with a NotImplementedError to be sure that subclasses think about what the right thing to do here is, and do it correctly.

class allennlp.semparse.type_declarations.type_declaration.ComplexType(first, second)[source]

Bases: nltk.sem.logic.ComplexType

In NLTK, a ComplexType is a function. These functions are curried, so if you need multiple arguments for your function you nest ComplexTypes. That currying makes things difficult for us, and we mitigate the problems by adding return_type and argument_types methods to ComplexType.

argument_types(self) → List[nltk.sem.logic.Type][source]

Gives the types of all arguments to this function. For functions returning a basic type, we grab all .first types until .second is no longer a ComplexType. That logic is implemented here in the base class. If you have a higher-order function that returns a function itself, you need to override this method.

return_type(self) → nltk.sem.logic.Type[source]

Gives the final return type for this function. If the function takes a single argument, this is just self.second. If the function takes multiple arguments and returns a basic type, this should be the final .second after following all complex types. That is the implementation here in the base class. If you have a higher-order function that returns a function itself, you need to override this method.
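
The traversal described for argument_types and return_type can be sketched without NLTK. Here `Fn` is a hypothetical stand-in for a curried function type, and basic types are plain strings; this is an illustration of the walk over .first and .second, not the library's code.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass(frozen=True)
class Fn:
    """Hypothetical stand-in for a curried function type <first,second>."""
    first: "Ty"
    second: "Ty"


Ty = Union[str, Fn]  # basic types are plain strings in this sketch


def argument_types(type_: Fn) -> List[Ty]:
    """Collect every .first while .second is still a function type."""
    arguments = [type_.first]
    remaining = type_.second
    while isinstance(remaining, Fn):
        arguments.append(remaining.first)
        remaining = remaining.second
    return arguments


def return_type(type_: Fn) -> Ty:
    """Follow .second until it is no longer a function type."""
    result = type_.second
    while isinstance(result, Fn):
        result = result.second
    return result


# <e,<r,<d,r>>>: three arguments (e, r, d), returning r.
three_argument_function = Fn("e", Fn("r", Fn("d", "r")))
print(argument_types(three_argument_function))  # ['e', 'r', 'd']
print(return_type(three_argument_function))     # r
```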

substitute_any_type(self, basic_types: Set[nltk.sem.logic.BasicType]) → List[nltk.sem.logic.Type][source]

Takes a set of BasicTypes and replaces any instances of ANY_TYPE inside this complex type with each of those basic types.

class allennlp.semparse.type_declarations.type_declaration.DynamicTypeApplicationExpression(function: nltk.sem.logic.Expression, argument: nltk.sem.logic.Expression, variables_with_placeholders: Set[str])[source]

Bases: nltk.sem.logic.ApplicationExpression

NLTK’s ApplicationExpression (which represents function applications like P(x)) has two limitations, which we overcome by inheriting from ApplicationExpression and overriding two methods.

Firstly, ApplicationExpression does not handle the case where P’s type involves placeholders (R, V, !=, etc.), which are special cases because their return types depend on the type of their arguments (x). We override the property type to redefine the type of the application.

Secondly, NLTK’s variables only bind to entities, and thus the variable types are ‘e’ by default. We get around this issue by replacing x with X, whose initial type is ANY_TYPE, and later gets resolved based on the type signature of the function whose scope the variable appears in. This variable binding operation is implemented by overriding _set_type below.

property type
class allennlp.semparse.type_declarations.type_declaration.DynamicTypeLogicParser(type_check: bool = True, constant_type_prefixes: Dict[str, nltk.sem.logic.BasicType] = None, type_signatures: Dict[str, nltk.sem.logic.Type] = None)[source]

Bases: nltk.sem.logic.LogicParser

DynamicTypeLogicParser is a LogicParser that can deal with NamedBasicType and PlaceholderType appropriately. Our extension here does two things differently.

Firstly, we should handle constants of different types. We do this by passing a dict of format {name_prefix: type} to the constructor. For example, your domain has entities of types unicorns and elves, and you have an entity “Phil” of type unicorn, and “Bob” of type “elf”. The names of the two entities should then be “unicorn:phil” and “elf:bob” respectively.

Secondly, since we defined a new kind of ApplicationExpression above, the LogicParser should be able to create this new kind of expression.
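
The prefix convention for constants can be pictured as a lookup on the part of the name before the colon. The sketch below is an assumption about the idea only: `constant_type` is a hypothetical helper, and types are plain strings where the real parser expects BasicType objects.

```python
from typing import Dict

# Hypothetical prefix table in the {name_prefix: type} format described above.
constant_type_prefixes: Dict[str, str] = {"unicorn": "UNICORN_TYPE", "elf": "ELF_TYPE"}


def constant_type(name: str, prefixes: Dict[str, str], default: str = "e") -> str:
    """Look up a constant's type from the prefix before the first colon."""
    prefix, separator, _ = name.partition(":")
    if separator and prefix in prefixes:
        return prefixes[prefix]
    return default  # NLTK's default for constants is the entity type 'e'


print(constant_type("unicorn:phil", constant_type_prefixes))  # UNICORN_TYPE
print(constant_type("elf:bob", constant_type_prefixes))       # ELF_TYPE
print(constant_type("phil", constant_type_prefixes))          # e
```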

make_ApplicationExpression(self, function, argument)[source]
make_VariableExpression(self, name)[source]
class allennlp.semparse.type_declarations.type_declaration.HigherOrderType(num_arguments: int, first: nltk.sem.logic.Type, second: nltk.sem.logic.Type)[source]

Bases: allennlp.semparse.type_declarations.type_declaration.ComplexType

A higher-order function is a ComplexType that returns functions. We just override return_type and argument_types to make sure that these types are correct.

Parameters
num_arguments : int

How many arguments this function takes before returning a function. We’ll go through this many levels of nested ComplexTypes before returning the final .second as our return type.

first : Type

Passed to NLTK’s ComplexType.

second : Type

Passed to NLTK’s ComplexType.

argument_types(self) → List[nltk.sem.logic.Type][source]

Gives the types of all arguments to this function. For functions returning a basic type, we grab all .first types until .second is no longer a ComplexType. That logic is implemented here in the base class. If you have a higher-order function that returns a function itself, you need to override this method.

return_type(self) → nltk.sem.logic.Type[source]

Gives the final return type for this function. If the function takes a single argument, this is just self.second. If the function takes multiple arguments and returns a basic type, this should be the final .second after following all complex types. That is the implementation here in the base class. If you have a higher-order function that returns a function itself, you need to override this method.
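
To see why HigherOrderType needs num_arguments, compare a fixed-depth walk with the base-class walk that continues until a basic type is reached. The sketch below uses a hypothetical `Fn` stand-in and hypothetical helper names; it only illustrates the idea of stopping after a given number of levels.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass(frozen=True)
class Fn:
    """Hypothetical stand-in for a curried function type <first,second>."""
    first: "Ty"
    second: "Ty"


Ty = Union[str, Fn]  # basic types are plain strings in this sketch


def higher_order_argument_types(type_: Fn, num_arguments: int) -> List[Ty]:
    """Take exactly num_arguments .first types, then stop."""
    arguments = []
    current: Ty = type_
    for _ in range(num_arguments):
        arguments.append(current.first)
        current = current.second
    return arguments


def higher_order_return_type(type_: Fn, num_arguments: int) -> Ty:
    """Descend num_arguments levels; whatever remains is the return type."""
    current: Ty = type_
    for _ in range(num_arguments):
        current = current.second
    return current


# <d,<e,r>> read as: takes one d and returns the function <e,r>.
returns_a_function = Fn("d", Fn("e", "r"))
print(higher_order_argument_types(returns_a_function, 1))  # ['d']
print(higher_order_return_type(returns_a_function, 1))     # Fn(first='e', second='r')
```

With num_arguments set to 2, the same type would instead be read as a two-argument function returning r, which is what the base-class traversal produces.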

class allennlp.semparse.type_declarations.type_declaration.MultiMatchNamedBasicType(string_rep, types_to_match: List[nltk.sem.logic.BasicType])[source]

Bases: allennlp.semparse.type_declarations.type_declaration.NamedBasicType

A NamedBasicType that matches with any type within a list of BasicTypes that it takes as an additional argument during instantiation. We just override the matches method in BasicType to match against any of the types given by the list.

Parameters
string_rep : str

String representation of the type, passed to super class.

types_to_match : List[BasicType]

List of types that this type should match with.
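
A minimal sketch of the multi-match idea follows. `NamedType` and `MultiMatchType` are hypothetical stand-ins that compare by name; the library's own classes inherit from NLTK's BasicType and may differ in detail.

```python
from typing import List


class NamedType:
    """Hypothetical stand-in for a named basic type."""

    def __init__(self, name: str) -> None:
        self.name = name

    def matches(self, other: "NamedType") -> bool:
        return self.name == other.name


class MultiMatchType(NamedType):
    """Matches itself and, additionally, any type in types_to_match."""

    def __init__(self, name: str, types_to_match: List[NamedType]) -> None:
        super().__init__(name)
        self.types_to_match = types_to_match

    def matches(self, other: NamedType) -> bool:
        return super().matches(other) or any(
            candidate.matches(other) for candidate in self.types_to_match
        )


number, date, string = NamedType("n"), NamedType("d"), NamedType("s")
comparable = MultiMatchType("c", [number, date])
print(comparable.matches(number))  # True
print(comparable.matches(string))  # False
```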

matches(self, other)[source]
class allennlp.semparse.type_declarations.type_declaration.NameMapper(language_has_lambda: bool = False, alias_prefix: str = 'F')[source]

Bases: object

The LogicParser we use has some naming conventions for functions (i.e. they should start with an upper case letter, and the remaining characters can only be digits). This means that we have to internally represent functions with unintuitive names. This class will automatically give unique names following the convention, and populate central mappings with these names. If for some reason you need to manually define the alias, you can do so by passing an alias to map_name_with_signature.

Parameters
language_has_lambda : bool (optional, default=False)

If your language has lambda functions, the word “lambda” needs to be in the name mapping, mapped to the alias “\”. NLTK understands this symbol, and it doesn’t need a type signature for it. Setting this flag to True adds the mapping to name_mapping.

alias_prefix : str (optional, default="F")

The one-letter prefix used for all aliases. You do not need to specify it if you have only one instance of this class for your language. If you have several, you can specify a different prefix for each name mapping you use for your language.
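
The aliasing scheme (an upper-case prefix followed by digits) can be sketched as a prefix plus a running counter. `TinyNameMapper` below is a hypothetical miniature; in particular, starting the counter at 0 is an assumption, not something the documentation states.

```python
from typing import Dict, Optional


class TinyNameMapper:
    """Hypothetical miniature of the aliasing idea: prefix plus a running counter."""

    def __init__(self, alias_prefix: str = "F") -> None:
        self.alias_prefix = alias_prefix
        self.name_mapping: Dict[str, str] = {}
        self._counter = 0  # assumption: numbering starts at 0

    def map_name(self, name: str, alias: Optional[str] = None) -> str:
        """Return the alias for name, generating one unless it is given explicitly."""
        if name in self.name_mapping:
            return self.name_mapping[name]
        if alias is None:
            alias = f"{self.alias_prefix}{self._counter}"
            self._counter += 1
        self.name_mapping[name] = alias
        return alias


functions = TinyNameMapper()                   # default prefix "F"
comparators = TinyNameMapper(alias_prefix="C")  # a second mapping gets its own prefix
print(functions.map_name("count"))    # F0
print(functions.map_name("argmax"))   # F1
print(comparators.map_name(">="))     # C0
```

Using a distinct prefix per mapper keeps aliases from two mappings from colliding when both are used for the same language.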

get_alias(self, name: str) → str[source]
get_signature(self, name: str) → nltk.sem.logic.Type[source]
map_name_with_signature(self, name: str, signature: nltk.sem.logic.Type, alias: str = None) → None[source]
class allennlp.semparse.type_declarations.type_declaration.NamedBasicType(string_rep)[source]

Bases: nltk.sem.logic.BasicType

A BasicType that also takes the name of the type as an argument to its constructor. Type resolution uses the output of __str__ as well, so basic types with different representations do not resolve against each other.

Parameters
string_rep : str

String representation of the type.

__str__(self)[source]
class allennlp.semparse.type_declarations.type_declaration.PlaceholderType(first, second)[source]

Bases: allennlp.semparse.type_declarations.type_declaration.ComplexType

PlaceholderType is a ComplexType that involves placeholders, and thus its type resolution is context sensitive. This is an abstract class for all placeholder types like reverse, and, or, argmax, etc.

Note that ANY_TYPE in NLTK’s type system doesn’t work like a wild card. Once the type of a variable gets resolved to a specific type, NLTK changes the type of that variable to that specific type. Hence, what NLTK calls “ANY_TYPE” is essentially a “yet-to-be-decided” type. This is a problem because we may want the same variable to bind to different types within a logical form, and using ANY_TYPE for this purpose will cause a resolution failure. For example, the count function may apply to both rows and cells in the same logical form, and making count of type ComplexType(ANY_TYPE, DATE_NUM_TYPE) will cause a resolution error. This class lets you define ComplexTypes with placeholders that are actually wild cards.

Subclasses of this abstract class need to do three things: 1) Override the property _signature to define the type signature (this is just the signature’s string representation and will not affect type inference or checking). You will see this signature in action sequences. 2) Override resolve to resolve the type appropriately (see the docstring in resolve for more information). 3) Override get_application_type, which returns the return type when this type is applied as a function to an argument of a specified type. For example, if you defined a reverse type by inheriting from this class and get_application_type gets an argument of type <a,b>, it should return <b,a>.
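
The reverse example can be sketched with a function type modelled as a pair of strings. `reverse_application_type` is a hypothetical helper used only to illustrate what get_application_type is expected to return for such a type.

```python
from typing import Tuple

# Hypothetical stand-in: a function type <a,b> is the pair ("a", "b").
FunctionType = Tuple[str, str]


def reverse_application_type(argument_type: FunctionType) -> FunctionType:
    """Applying reverse to an argument of type <a,b> yields a value of type <b,a>."""
    first, second = argument_type
    return (second, first)


print(reverse_application_type(("a", "b")))  # ('b', 'a')
```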

get_application_type(self, argument_type: nltk.sem.logic.Type) → nltk.sem.logic.Type[source]

This method returns the resulting type when this type is applied as a function to an argument of the given type.

matches(self, other) → bool[source]
resolve(self, other: nltk.sem.logic.Type) → Union[nltk.sem.logic.Type, NoneType][source]

This method is central to type inference and checking. When a variable’s type is being checked, we compare what we know of its type against what is expected of its type by its context. The expectation is provided as other. We make sure that there are no contradictions between this type and other, and return an updated type which may be more specific than the original type.

For example, say this type is of the function variable F in F(cell), and we start out with <?, d> (that is, it takes any type and returns d). Now we have already resolved cell to be of type e. Then resolve gets called with other = <e, ?>, because we know F is a function that took a constant of type e. When we resolve <e, ?> against <?, d>, there will not be a contradiction, because any type can be successfully resolved against ?. Finally we return <e, d> as the resolved type.

As a counterexample, if we are trying to resolve <?, d> against <?, e>, the resolution fails, and in that case, this method returns None.

Note that a successful resolution does not imply equality of types because one of them may be ANY_TYPE, and so in the subclasses of this type, we explicitly resolve in both directions.
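
The two resolution examples can be reproduced with a small sketch in which "?" plays the role of ANY_TYPE and a function type is a pair of basic types. `resolve_basic` and `resolve_function` are hypothetical helpers, not the library's resolve implementation.

```python
from typing import Optional, Tuple

ANY = "?"  # stand-in for ANY_TYPE
FunctionType = Tuple[str, str]


def resolve_basic(this: str, other: str) -> Optional[str]:
    """Resolve two basic types; '?' yields to whatever the other side knows."""
    if this == ANY:
        return other
    if other == ANY or this == other:
        return this
    return None  # two different concrete types contradict each other


def resolve_function(this: FunctionType, other: FunctionType) -> Optional[FunctionType]:
    """Resolve argument and return positions independently; fail if either fails."""
    first = resolve_basic(this[0], other[0])
    second = resolve_basic(this[1], other[1])
    if first is None or second is None:
        return None
    return (first, second)


print(resolve_function(("?", "d"), ("e", "?")))  # ('e', 'd')
print(resolve_function(("?", "d"), ("?", "e")))  # None
```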

__str__(self)[source]
substitute_any_type(self, basic_types: Set[nltk.sem.logic.BasicType]) → List[nltk.sem.logic.Type][source]

Placeholders mess with substitutions, so even though this method is implemented in the superclass, we override it here with a NotImplementedError to be sure that subclasses think about what the right thing to do here is, and do it correctly.

class allennlp.semparse.type_declarations.type_declaration.TypedConstantExpression(variable, default_type: nltk.sem.logic.Type)[source]

Bases: nltk.sem.logic.ConstantExpression

NLTK assumes all constants are of type EntityType (e) by default. We define this new class where we can pass a default type to the constructor and use that in the _set_type method.

class allennlp.semparse.type_declarations.type_declaration.UnaryOpType(type_: nltk.sem.logic.BasicType = ?, allowed_substitutions: Set[nltk.sem.logic.BasicType] = None, signature: str = '<#1, #1>')[source]

Bases: allennlp.semparse.type_declarations.type_declaration.PlaceholderType

UnaryOpType is a kind of PlaceholderType that takes an argument of any type and returns an expression of the same type. identity is an example of this kind of function. The type signature of UnaryOpType is <#1, #1>.

Parameters
allowed_substitutions : Set[BasicType], optional (default=None)

If given, this sets restrictions on the types that can be substituted. That is, say you have a unary operation that is only permitted for numbers and dates, you can pass those in here, and we will only consider those types when calling substitute_any_type(). If this is None, all basic types are allowed.

signature : str, optional (default='<#1,#1>')

The signature of the operation is what will appear in action sequences that include this type. The default value is suitable for functions that apply to any type. If you have a restricted set of allowed substitutions, you likely want to change the type signature to reflect that.
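
The effect of allowed_substitutions can be sketched as a filter applied before the placeholder is instantiated. `substitute_unary` is a hypothetical helper over string-valued types; sorting the candidates is a choice made here for reproducible output, not a documented behavior.

```python
from typing import List, Optional, Set, Tuple


def substitute_unary(
    basic_types: Set[str], allowed_substitutions: Optional[Set[str]] = None
) -> List[Tuple[str, str]]:
    """Instantiate <#1,#1> once per permitted basic type, as (argument, result) pairs."""
    if allowed_substitutions is None:
        candidates = basic_types  # no restriction: every basic type is allowed
    else:
        candidates = basic_types & allowed_substitutions
    return [(basic_type, basic_type) for basic_type in sorted(candidates)]


all_basic_types = {"e", "r", "n", "d"}
print(substitute_unary(all_basic_types, {"n", "d"}))  # [('d', 'd'), ('n', 'n')]
print(len(substitute_unary(all_basic_types)))         # 4
```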

get_application_type(self, argument_type: nltk.sem.logic.Type) → nltk.sem.logic.Type[source]

This method returns the resulting type when this type is applied as a function to an argument of the given type.

resolve(self, other) → Union[nltk.sem.logic.Type, NoneType][source]

See PlaceholderType.resolve

substitute_any_type(self, basic_types: Set[nltk.sem.logic.BasicType]) → List[nltk.sem.logic.Type][source]

Placeholders mess with substitutions, so even though this method is implemented in the superclass, we override it here with a NotImplementedError to be sure that subclasses think about what the right thing to do here is, and do it correctly.

allennlp.semparse.type_declarations.type_declaration.get_valid_actions(name_mapping: Dict[str, str], type_signatures: Dict[str, nltk.sem.logic.Type], basic_types: Set[nltk.sem.logic.Type], multi_match_mapping: Dict[nltk.sem.logic.Type, List[nltk.sem.logic.Type]] = None, valid_starting_types: Set[nltk.sem.logic.Type] = None, num_nested_lambdas: int = 0) → Dict[str, List[str]][source]

Generates all the valid actions starting from each non-terminal. For terminals of a specific type, we simply add a production from the type to the terminal. For all terminal functions, we additionally add a rule that allows their return type to be generated from an application of the function. For example, the function <e,<r,<d,r>>>, which takes three arguments and returns an r, would generate the production rule r -> [<e,<r,<d,r>>>, e, r, d].

For functions that do not contain ANY_TYPE or placeholder types, this is straightforward. When there are ANY_TYPEs or placeholders, we substitute the ANY_TYPE with all possible basic types, and then produce a similar rule. For example, the identity function, with type <#1,#1> and basic types e and r, would produce the rules e -> [<#1,#1>, e] and r -> [<#1,#1>, r].

We additionally add a valid action from the start symbol to all valid_starting_types.
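
The two kinds of application rules described above can be reproduced as strings with a short sketch. Types are modelled as strings and nested tuples, and `show`, `application_rule`, and `identity_rules` are hypothetical helpers; the real function builds a full mapping from non-terminals to actions, which this sketch does not attempt.

```python
from typing import List, Set, Tuple, Union

# Hypothetical stand-ins: a basic type is a string, a function type is (first, second).
Type = Union[str, Tuple["Type", "Type"]]


def show(type_: Type) -> str:
    """Render a type in the <first,second> notation."""
    if isinstance(type_, tuple):
        return f"<{show(type_[0])},{show(type_[1])}>"
    return type_


def application_rule(function_type: Type) -> str:
    """Build 'return_type -> [function_type, arg1, arg2, ...]' for a curried type."""
    arguments: List[Type] = []
    result = function_type
    while isinstance(result, tuple):
        arguments.append(result[0])
        result = result[1]
    right_side = ", ".join(show(part) for part in [function_type] + arguments)
    return f"{show(result)} -> [{right_side}]"


def identity_rules(basic_types: Set[str]) -> List[str]:
    """One rule per basic type for a placeholder function of type <#1,#1>."""
    return [f"{basic_type} -> [<#1,#1>, {basic_type}]" for basic_type in sorted(basic_types)]


print(application_rule(("e", ("r", ("d", "r")))))  # r -> [<e,<r,<d,r>>>, e, r, d]
print(identity_rules({"e", "r"}))
```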

Parameters
name_mapping : Dict[str, str]

The mapping of names that appear in your logical form languages to their aliases for NLTK. If you are getting all valid actions for a type declaration, this can be the COMMON_NAME_MAPPING.

type_signatures : Dict[str, Type]

The mapping from name aliases to their types. If you are getting all valid actions for a type declaration, this can be the COMMON_TYPE_SIGNATURE.

basic_types : Set[Type]

Set of all basic types in the type declaration.

multi_match_mapping : Dict[Type, List[Type]] (optional)

A mapping from MultiMatchNamedBasicTypes to the types they can match. This may be different from the type’s types_to_match field based on the context. While building action sequences that lead to complex types with MultiMatchNamedBasicTypes, if a type does not occur in this mapping, the default set of types_to_match for that type will be used.

valid_starting_types : Set[Type], optional

These are the valid starting types for your grammar; e.g., what types are we allowed to parse expressions into? We will add a “START -> TYPE” rule for each of these types. If this is None, we default to using basic_types.

num_nested_lambdas : int (optional)

Does the language used permit lambda expressions? And if so, how many nested lambdas do we need to worry about? We’ll add rules like “<r,d> -> [‘lambda x’, d]” for all complex types, where the variable is determined by the number of nestings. We currently only permit up to three levels of nesting, just for ease of implementation.

allennlp.semparse.type_declarations.type_declaration.is_nonterminal(production: str) → bool[source]
allennlp.semparse.type_declarations.type_declaration.substitute_any_type(type_: nltk.sem.logic.Type, basic_types: Set[nltk.sem.logic.BasicType]) → List[nltk.sem.logic.Type][source]

Takes a type and a set of basic types, substitutes all instances of ANY_TYPE with all possible basic types, and returns a list with all possible combinations. Note that this substitution is unconstrained. That is, if you have a type with placeholders, <#1,#1> for example, this may substitute the placeholders with different basic types. In that case, you’d want to use _substitute_placeholder_type instead.
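
The sketch below illustrates what "unconstrained" means here: each occurrence of the wildcard is substituted independently, so mixed combinations appear. `substitute_any` is a hypothetical helper over strings and tuples, with "?" standing in for ANY_TYPE; the sorted order is chosen only to make the output reproducible.

```python
from itertools import product
from typing import List, Set, Tuple, Union

ANY = "?"  # stand-in for ANY_TYPE
Type = Union[str, Tuple["Type", "Type"]]


def substitute_any(type_: Type, basic_types: Set[str]) -> List[Type]:
    """Replace every '?' independently with each basic type."""
    if type_ == ANY:
        return sorted(basic_types)
    if isinstance(type_, tuple):
        firsts = substitute_any(type_[0], basic_types)
        seconds = substitute_any(type_[1], basic_types)
        # Every combination of first and second is produced, with no link between them.
        return [(first, second) for first, second in product(firsts, seconds)]
    return [type_]  # a concrete basic type has nothing to substitute


print(substitute_any(("?", "?"), {"e", "r"}))
# [('e', 'e'), ('e', 'r'), ('r', 'e'), ('r', 'r')]
```

The mixed results ('e', 'r') and ('r', 'e') are exactly what a placeholder type such as <#1,#1> must not produce, which is why placeholder types need their own substitution logic.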

Defines all the types in the QuaRel domain.

class allennlp.semparse.type_declarations.quarel_type_declaration.QuarelTypeDeclaration(syntax: str)[source]

Bases: object