allennlp.semparse.type_declarations
This module defines some classes that are generally useful for defining a type system for a new
domain. We inherit the type logic in nltk.sem.logic and add some functionality on top of it
here. There are two main improvements:
1) We allow defining multiple basic types with their own names (see NamedBasicType).
2) We allow defining function types that have placeholders in them (see PlaceholderType).
We also extend NLTK’s LogicParser to define a DynamicTypeLogicParser that knows how to deal
with the two improvements above.
- class allennlp.semparse.type_declarations.type_declaration.BinaryOpType(type_: nltk.sem.logic.BasicType = ?, allowed_substitutions: Set[nltk.sem.logic.BasicType] = None, signature: str = '<#1, <#1, #1>>')
Bases: allennlp.semparse.type_declarations.type_declaration.PlaceholderType
BinaryOpType is a function that takes two arguments of the same type and returns an argument of that type. +, -, and, and or are examples of this kind of function. The type signature of BinaryOpType is <#1,<#1,#1>>.
- Parameters
- allowed_substitutions
Set[BasicType], optional (default=None) If given, this sets restrictions on the types that can be substituted. For example, if you have a binary operation that is only permitted for numbers and dates, you can pass those in here, and we will only consider those types when calling substitute_any_type(). If this is None, all basic types are allowed.
- signature
str, optional (default='<#1,<#1,#1>>') The signature of the operation is what will appear in action sequences that include this type. The default value is suitable for functions that apply to any type. If you have a restricted set of allowed substitutions, you likely want to change the type signature to reflect that.
- get_application_type(self, argument_type: nltk.sem.logic.Type) → nltk.sem.logic.Type
This method returns the resulting type when this type is applied as a function to an argument of the given type.
- resolve(self, other: nltk.sem.logic.Type) → Union[nltk.sem.logic.Type, NoneType]
See PlaceholderType.resolve.
- substitute_any_type(self, basic_types: Set[nltk.sem.logic.BasicType]) → List[nltk.sem.logic.Type]
Placeholders mess with substitutions, so even though this method is implemented in the superclass, we override it here with a NotImplementedError to be sure that subclasses think about what the right thing to do here is, and do it correctly.
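As a rough illustration of the <#1,<#1,#1>> signature, the sketch below models types as plain strings rather than NLTK type objects. The name binary_op_application_type is hypothetical and not part of the library. It only assumes that binding #1 to the first argument's type leaves a function from that type to itself.

```python
def binary_op_application_type(argument_type: str) -> str:
    """Hypothetical model: result of applying <#1,<#1,#1>> to one argument."""
    # Once #1 is bound to the first argument's type, what remains is
    # a function that takes that same type and returns it.
    return f"<{argument_type},{argument_type}>"


# Applying a binary operation such as + to a number type "n"
# leaves a function that still expects one more "n".
partially_applied = binary_op_application_type("n")
print(partially_applied)
```

Because the function is curried, a second application to another "n" would then yield the basic type "n".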
- class allennlp.semparse.type_declarations.type_declaration.ComplexType(first, second)
Bases: nltk.sem.logic.ComplexType
In NLTK, a ComplexType is a function. These functions are curried, so if you need multiple arguments for your function you nest ComplexTypes. That currying makes things difficult for us, and we mitigate the problems by adding return_type and argument_types functions to ComplexType.
- argument_types(self) → List[nltk.sem.logic.Type]
Gives the types of all arguments to this function. For functions returning a basic type, we grab all .first types until .second is no longer a ComplexType. That logic is implemented here in the base class. If you have a higher-order function that returns a function itself, you need to override this method.
- return_type(self) → nltk.sem.logic.Type
Gives the final return type for this function. If the function takes a single argument, this is just self.second. If the function takes multiple arguments and returns a basic type, this should be the final .second after following all complex types. That is the implementation here in the base class. If you have a higher-order function that returns a function itself, you need to override this method.
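The traversal described for argument_types and return_type can be sketched without NLTK. The Basic and Func classes below are hypothetical stand-ins for basic and curried function types, and the two helper functions are an illustration of the described walk, not the library's code.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass(frozen=True)
class Basic:
    """Hypothetical stand-in for a basic type such as e, r, or d."""
    name: str


@dataclass(frozen=True)
class Func:
    """Hypothetical stand-in for a curried function type <first, second>."""
    first: "TypeLike"
    second: "TypeLike"


TypeLike = Union[Basic, Func]


def argument_types(type_: Func) -> List[TypeLike]:
    # Collect every .first while .second is still a function type.
    arguments = [type_.first]
    remaining = type_.second
    while isinstance(remaining, Func):
        arguments.append(remaining.first)
        remaining = remaining.second
    return arguments


def return_type(type_: Func) -> TypeLike:
    # Follow .second until it is no longer a function type.
    result = type_.second
    while isinstance(result, Func):
        result = result.second
    return result


e, r, d = Basic("e"), Basic("r"), Basic("d")
three_args = Func(e, Func(r, Func(d, r)))  # models <e,<r,<d,r>>>
print(argument_types(three_args))
print(return_type(three_args))
```

This walk always descends to a basic type, which is why a function that returns another function needs different handling.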
- class allennlp.semparse.type_declarations.type_declaration.DynamicTypeApplicationExpression(function: nltk.sem.logic.Expression, argument: nltk.sem.logic.Expression, variables_with_placeholders: Set[str])
Bases: nltk.sem.logic.ApplicationExpression
NLTK’s ApplicationExpression (which represents function applications like P(x)) has two limitations, which we overcome by inheriting from ApplicationExpression and overriding two methods.
Firstly, ApplicationExpression does not handle the case where P’s type involves placeholders (R, V, !=, etc.), which are special cases because their return types depend on the type of their arguments (x). We override the property type to redefine the type of the application.
Secondly, NLTK’s variables only bind to entities, and thus the variable types are ‘e’ by default. We get around this issue by replacing x with X, whose initial type is ANY_TYPE, and later gets resolved based on the type signature of the function whose scope the variable appears in. This variable binding operation is implemented by overriding _set_type below.
- property type
- class allennlp.semparse.type_declarations.type_declaration.DynamicTypeLogicParser(type_check: bool = True, constant_type_prefixes: Dict[str, nltk.sem.logic.BasicType] = None, type_signatures: Dict[str, nltk.sem.logic.Type] = None)
Bases: nltk.sem.logic.LogicParser
DynamicTypeLogicParser is a LogicParser that can deal with NamedBasicType and PlaceholderType appropriately. Our extension here does two things differently.
Firstly, we should handle constants of different types. We do this by passing a dict of format {name_prefix: type} to the constructor. For example, suppose your domain has entities of types unicorn and elf, and you have an entity “Phil” of type unicorn and an entity “Bob” of type elf. The names of the two entities should then be “unicorn:phil” and “elf:bob” respectively.
Secondly, since we defined a new kind of ApplicationExpression above, the LogicParser should be able to create this new kind of expression.
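The prefix convention for constants can be pictured as a dictionary lookup. The sketch below is a hypothetical helper, not the parser's actual code: constant_type, the type names, and the fallback to "e" (mirroring NLTK's default entity type) are all assumptions made for illustration.

```python
from typing import Dict


def constant_type(name: str,
                  constant_type_prefixes: Dict[str, str],
                  default_type: str = "e") -> str:
    """Hypothetical lookup of a constant's type from its name prefix."""
    # Split "unicorn:phil" into the prefix "unicorn" and the rest.
    prefix, separator, _ = name.partition(":")
    if separator and prefix in constant_type_prefixes:
        return constant_type_prefixes[prefix]
    # Names without a known prefix fall back to the default type.
    return default_type


prefixes = {"unicorn": "UNICORN_TYPE", "elf": "ELF_TYPE"}
print(constant_type("unicorn:phil", prefixes))
print(constant_type("elf:bob", prefixes))
```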
- class allennlp.semparse.type_declarations.type_declaration.HigherOrderType(num_arguments: int, first: nltk.sem.logic.Type, second: nltk.sem.logic.Type)
Bases: allennlp.semparse.type_declarations.type_declaration.ComplexType
A higher-order function is a ComplexType that returns functions. We just override return_type and argument_types to make sure that these types are correct.
- Parameters
- num_arguments
int How many arguments this function takes before returning a function. We’ll go through this many levels of nested ComplexTypes before returning the final .second as our return type.
- first
Type Passed to NLTK’s ComplexType.
- second
Type Passed to NLTK’s ComplexType.
- argument_types(self) → List[nltk.sem.logic.Type]
Gives the types of all arguments to this function. For functions returning a basic type, we grab all .first types until .second is no longer a ComplexType. That logic is implemented here in the base class. If you have a higher-order function that returns a function itself, you need to override this method.
- return_type(self) → nltk.sem.logic.Type
Gives the final return type for this function. If the function takes a single argument, this is just self.second. If the function takes multiple arguments and returns a basic type, this should be the final .second after following all complex types. That is the implementation here in the base class. If you have a higher-order function that returns a function itself, you need to override this method.
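To show how num_arguments changes the traversal, here is a minimal sketch that encodes a basic type as a string and a function type <a,b> as the tuple (a, b). The encoding and both helper names are hypothetical. The sketch only assumes that the walk stops after num_arguments levels instead of descending to a basic type.

```python
from typing import List, Tuple, Union

# Hypothetical encoding: basic types are strings, <a,b> is the tuple (a, b).
TypeLike = Union[str, Tuple["TypeLike", "TypeLike"]]


def higher_order_argument_types(type_: TypeLike, num_arguments: int) -> List[TypeLike]:
    # Take exactly num_arguments .first entries, one per level of nesting.
    arguments = []
    current = type_
    for _ in range(num_arguments):
        first, second = current
        arguments.append(first)
        current = second
    return arguments


def higher_order_return_type(type_: TypeLike, num_arguments: int) -> TypeLike:
    # Whatever is left after num_arguments levels is the return type,
    # even if it is itself a function type.
    current = type_
    for _ in range(num_arguments):
        current = current[1]
    return current


# <<e,r>,<d,<e,r>>>: takes a function <e,r> and a d, returns a function <e,r>.
example = (("e", "r"), ("d", ("e", "r")))
print(higher_order_argument_types(example, 2))
print(higher_order_return_type(example, 2))
```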
- class allennlp.semparse.type_declarations.type_declaration.MultiMatchNamedBasicType(string_rep, types_to_match: List[nltk.sem.logic.BasicType])
Bases: allennlp.semparse.type_declarations.type_declaration.NamedBasicType
A NamedBasicType that matches with any type within a list of BasicTypes that it takes as an additional argument during instantiation. We just override the matches method in BasicType to match against any of the types given by the list.
- Parameters
- string_rep
str String representation of the type, passed to super class.
- types_to_match
List[BasicType] List of types that this type should match with.
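The matching behaviour can be pictured with a small stand-in class that uses strings for types. MultiMatchSketch is hypothetical and not the library class. It assumes that a type matches its own name as well as any of the listed types.

```python
from typing import List


class MultiMatchSketch:
    """Hypothetical stand-in for a named basic type that matches several types."""

    def __init__(self, string_rep: str, types_to_match: List[str]) -> None:
        self.string_rep = string_rep
        self.types_to_match = types_to_match

    def matches(self, other: str) -> bool:
        # Match the type's own name, or any of the listed types.
        return other == self.string_rep or other in self.types_to_match


date_or_number = MultiMatchSketch("dn", ["date", "number"])
print(date_or_number.matches("date"))
print(date_or_number.matches("row"))
```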
- class allennlp.semparse.type_declarations.type_declaration.NameMapper(language_has_lambda: bool = False, alias_prefix: str = 'F')
Bases: object
The LogicParser we use has some naming conventions for functions (i.e. they should start with an upper case letter, and the remaining characters can only be digits). This means that we have to internally represent functions with unintuitive names. This class will automatically give unique names following the convention, and populate central mappings with these names. If for some reason you need to manually define the alias, you can do so by passing an alias to map_name_with_signature.
- Parameters
- language_has_lambda
bool (optional, default=False) If your language has lambda functions, the word “lambda” needs to be in the name mapping, mapped to the alias “\”. NLTK understands this symbol, and it doesn’t need a type signature for it. Setting this flag to True adds the mapping to name_mapping.
- alias_prefix
str (optional, default='F') The one-letter prefix used for all aliases. You do not need to specify it if you have only one instance of this class for your language. If not, you can specify a different prefix for each name mapping you use for your language.
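The aliasing convention (one upper-case letter followed by digits) can be sketched as a counter behind a prefix. AliasSketch and its map_name method are hypothetical and do not reproduce the NameMapper API. In particular, starting the counter at 0 is an assumption.

```python
from typing import Dict, Optional


class AliasSketch:
    """Hypothetical stand-in that hands out aliases such as F0, F1, ..."""

    def __init__(self, alias_prefix: str = "F") -> None:
        self.alias_prefix = alias_prefix
        self.name_mapping: Dict[str, str] = {}
        self._counter = 0  # assumption: numbering starts at 0

    def map_name(self, name: str, alias: Optional[str] = None) -> str:
        # A name that was already mapped keeps its alias.
        if name in self.name_mapping:
            return self.name_mapping[name]
        if alias is None:
            # Generate the next alias: prefix letter plus digits.
            alias = f"{self.alias_prefix}{self._counter}"
            self._counter += 1
        self.name_mapping[name] = alias
        return alias


mapper = AliasSketch()
first = mapper.map_name("argmax")
second = mapper.map_name("count")
manual = mapper.map_name("reverse", alias="R")  # manually chosen alias
print(mapper.name_mapping)
```

Using a different prefix per mapper keeps the generated aliases from colliding when a language needs more than one name mapping.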
- class allennlp.semparse.type_declarations.type_declaration.NamedBasicType(string_rep)
Bases: nltk.sem.logic.BasicType
A BasicType that also takes the name of the type as an argument to its constructor. Type resolution uses the output of __str__ as well, so basic types with different representations do not resolve against each other.
- Parameters
- string_rep
str String representation of the type.
- class allennlp.semparse.type_declarations.type_declaration.PlaceholderType(first, second)
Bases: allennlp.semparse.type_declarations.type_declaration.ComplexType
PlaceholderType is a ComplexType that involves placeholders, and thus its type resolution is context sensitive. This is an abstract class for all placeholder types like reverse, and, or, argmax, etc.
Note that ANY_TYPE in NLTK’s type system doesn’t work like a wild card. Once the type of a variable gets resolved to a specific type, NLTK changes the type of that variable to that specific type. Hence, what NLTK calls “ANY_TYPE” is essentially a “yet-to-be-decided” type. This is a problem because we may want the same variable to bind to different types within a logical form, and using ANY_TYPE for this purpose will cause a resolution failure. For example, the count function may apply to both rows and cells in the same logical form, and making count of type ComplexType(ANY_TYPE, DATE_NUM_TYPE) will cause a resolution error. This class lets you define ComplexTypes with placeholders that are actually wild cards.
The subclasses of this abstract class need to do three things:
1) Override the property _signature to define the type signature (this is just the signature’s string representation and will not affect type inference or checking). You will see this signature in action sequences.
2) Override resolve to resolve the type appropriately (see the docstring in resolve for more information).
3) Override get_application_type, which returns the return type when this type is applied as a function to an argument of a specified type. For example, if you defined a reverse type by inheriting from this class and get_application_type gets an argument of type <a,b>, it should return <b,a>.
- get_application_type(self, argument_type: nltk.sem.logic.Type) → nltk.sem.logic.Type
This method returns the resulting type when this type is applied as a function to an argument of the given type.
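For the reverse example mentioned in the class description, the application type simply swaps the two components of the argument. The sketch below writes <a,b> as the tuple (a, b). reverse_application_type is a hypothetical name, not a method of the library.

```python
from typing import Tuple


def reverse_application_type(argument_type: Tuple[str, str]) -> Tuple[str, str]:
    """Hypothetical model of get_application_type for a reverse type."""
    # The argument must itself be a function type <a,b>, written (a, b) here.
    first, second = argument_type
    return (second, first)


print(reverse_application_type(("a", "b")))
```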
- resolve(self, other: nltk.sem.logic.Type) → Union[nltk.sem.logic.Type, NoneType]
This method is central to type inference and checking. When a variable’s type is being checked, we compare what we know of its type against what is expected of its type by its context. The expectation is provided as other. We make sure that there are no contradictions between this type and other, and return an updated type which may be more specific than the original type.
For example, say this type is of the function variable F in F(cell), and we start out with <?, d> (that is, it takes any type and returns d). Now we have already resolved cell to be of type e. Then resolve gets called with other = <e, ?>, because we know F is a function that took a constant of type e. When we resolve <e, ?> against <?, d>, there will not be a contradiction, because any type can be successfully resolved against ?. Finally we return <e, d> as the resolved type.
As a counter example, if we are trying to resolve <?, d> against <?, e>, the resolution fails, and in that case, this method returns None.
Note that a successful resolution does not imply equality of types because one of them may be ANY_TYPE, and so in the subclasses of this type, we explicitly resolve in both directions.
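The two worked examples above can be reproduced with a small model of resolution. This sketch encodes a basic type as a string, <a,b> as the tuple (a, b), and ANY_TYPE as "?". The resolve function here is a hypothetical illustration of the idea and does not handle placeholders the way subclasses of PlaceholderType do.

```python
from typing import Optional, Tuple, Union

ANY = "?"  # stands in for ANY_TYPE in this sketch
TypeLike = Union[str, Tuple["TypeLike", "TypeLike"]]


def resolve(this: TypeLike, other: TypeLike) -> Optional[TypeLike]:
    # A wildcard on either side resolves to whatever the other side says.
    if this == ANY:
        return other
    if other == ANY:
        return this
    # Two function types resolve component by component.
    if isinstance(this, tuple) and isinstance(other, tuple):
        first = resolve(this[0], other[0])
        second = resolve(this[1], other[1])
        if first is None or second is None:
            return None
        return (first, second)
    # Otherwise both sides are concrete and must agree exactly.
    return this if this == other else None


print(resolve((ANY, "d"), ("e", ANY)))   # <?, d> against <e, ?>
print(resolve((ANY, "d"), (ANY, "e")))   # <?, d> against <?, e>
```

The first call fills in both unknowns, while the second fails because d and e contradict each other.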
- substitute_any_type(self, basic_types: Set[nltk.sem.logic.BasicType]) → List[nltk.sem.logic.Type]
Placeholders mess with substitutions, so even though this method is implemented in the superclass, we override it here with a NotImplementedError to be sure that subclasses think about what the right thing to do here is, and do it correctly.
- class allennlp.semparse.type_declarations.type_declaration.TypedConstantExpression(variable, default_type: nltk.sem.logic.Type)
Bases: nltk.sem.logic.ConstantExpression
NLTK assumes all constants are of type EntityType (e) by default. We define this new class where we can pass a default type to the constructor and use that in the _set_type method.
- class allennlp.semparse.type_declarations.type_declaration.UnaryOpType(type_: nltk.sem.logic.BasicType = ?, allowed_substitutions: Set[nltk.sem.logic.BasicType] = None, signature: str = '<#1, #1>')
Bases: allennlp.semparse.type_declarations.type_declaration.PlaceholderType
UnaryOpType is a kind of PlaceholderType that takes an argument of any type and returns an expression of the same type. identity is an example of this kind of function. The type signature of UnaryOpType is <#1, #1>.
- Parameters
- allowed_substitutions
Set[BasicType], optional (default=None) If given, this sets restrictions on the types that can be substituted. For example, if you have a unary operation that is only permitted for numbers and dates, you can pass those in here, and we will only consider those types when calling substitute_any_type(). If this is None, all basic types are allowed.
- signature
str, optional (default='<#1,#1>') The signature of the operation is what will appear in action sequences that include this type. The default value is suitable for functions that apply to any type. If you have a restricted set of allowed substitutions, you likely want to change the type signature to reflect that.
- get_application_type(self, argument_type: nltk.sem.logic.Type) → nltk.sem.logic.Type
This method returns the resulting type when this type is applied as a function to an argument of the given type.
- substitute_any_type(self, basic_types: Set[nltk.sem.logic.BasicType]) → List[nltk.sem.logic.Type]
Placeholders mess with substitutions, so even though this method is implemented in the superclass, we override it here with a NotImplementedError to be sure that subclasses think about what the right thing to do here is, and do it correctly.
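The effect of allowed_substitutions on a <#1,#1> type can be sketched with strings standing in for types. The function unary_op_substitutions is hypothetical. It assumes that the restriction simply filters the candidate basic types, and that the same type fills both occurrences of #1.

```python
from typing import List, Optional, Set


def unary_op_substitutions(basic_types: Set[str],
                           allowed: Optional[Set[str]] = None) -> List[str]:
    """Hypothetical model of substituting basic types into <#1,#1>."""
    # With no restriction, every basic type is a candidate for #1.
    candidates = basic_types if allowed is None else basic_types & allowed
    # The same type fills both occurrences of #1.
    return [f"<{t},{t}>" for t in sorted(candidates)]


print(unary_op_substitutions({"e", "n", "d"}))
print(unary_op_substitutions({"e", "n", "d"}, allowed={"n", "d"}))
```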
- allennlp.semparse.type_declarations.type_declaration.get_valid_actions(name_mapping: Dict[str, str], type_signatures: Dict[str, nltk.sem.logic.Type], basic_types: Set[nltk.sem.logic.Type], multi_match_mapping: Dict[nltk.sem.logic.Type, List[nltk.sem.logic.Type]] = None, valid_starting_types: Set[nltk.sem.logic.Type] = None, num_nested_lambdas: int = 0) → Dict[str, List[str]]
Generates all the valid actions starting from each non-terminal. For terminals of a specific type, we simply add a production from the type to the terminal. For all terminal functions, we additionally add a rule that allows their return type to be generated from an application of the function. For example, the function <e,<r,<d,r>>>, which takes three arguments and returns an r, would generate the production rule r -> [<e,<r,<d,r>>>, e, r, d].
For functions that do not contain ANY_TYPE or placeholder types, this is straightforward. When there are ANY_TYPEs or placeholders, we substitute the ANY_TYPE with all possible basic types, and then produce a similar rule. For example, the identity function, with type <#1,#1> and basic types e and r, would produce the rules e -> [<#1,#1>, e] and r -> [<#1,#1>, r].
We additionally add a valid action from the start symbol to all valid_starting_types.
- Parameters
- name_mapping
Dict[str, str] The mapping of names that appear in your logical form languages to their aliases for NLTK. If you are getting all valid actions for a type declaration, this can be the COMMON_NAME_MAPPING.
- type_signatures
Dict[str, Type] The mapping from name aliases to their types. If you are getting all valid actions for a type declaration, this can be the COMMON_TYPE_SIGNATURE.
- basic_types
Set[Type] Set of all basic types in the type declaration.
- multi_match_mapping
Dict[Type, List[Type]] (optional) A mapping from MultiMatchNamedBasicTypes to the types they can match. This may be different from the type’s types_to_match field based on the context. While building action sequences that lead to complex types with MultiMatchNamedBasicTypes, if a type does not occur in this mapping, the default set of types_to_match for that type will be used.
- valid_starting_types
Set[Type], optional These are the valid starting types for your grammar; e.g., what types are we allowed to parse expressions into? We will add a “START -> TYPE” rule for each of these types. If this is None, we default to using basic_types.
- num_nested_lambdas
int (optional) Does the language used permit lambda expressions? And if so, how many nested lambdas do we need to worry about? We’ll add rules like “<r,d> -> [‘lambda x’, d]” for all complex types, where the variable is determined by the number of nestings. We currently only permit up to three levels of nesting, just for ease of implementation.
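The two production-rule examples in the description of get_valid_actions can be reproduced with a short sketch. It encodes a basic type as a string and <a,b> as the tuple (a, b). The helpers type_to_str, application_production, and identity_productions are hypothetical and only format single rules. They do not build the full Dict[str, List[str]] that the real function returns.

```python
from typing import List, Set, Tuple, Union

# Hypothetical encoding: a basic type is a string and a function type <a,b>
# is the tuple (a, b). None of these helpers are part of the library.
TypeLike = Union[str, Tuple["TypeLike", "TypeLike"]]


def type_to_str(type_: TypeLike) -> str:
    if isinstance(type_, tuple):
        return f"<{type_to_str(type_[0])},{type_to_str(type_[1])}>"
    return type_


def application_production(function_type: TypeLike) -> str:
    # Peel off one argument per level of currying until a basic type remains.
    arguments: List[TypeLike] = []
    current = function_type
    while isinstance(current, tuple):
        arguments.append(current[0])
        current = current[1]
    right_side = [type_to_str(function_type)] + [type_to_str(a) for a in arguments]
    return f"{type_to_str(current)} -> [{', '.join(right_side)}]"


def identity_productions(basic_types: Set[str]) -> List[str]:
    # For <#1,#1>, each basic type is substituted for #1 in turn.
    return [f"{t} -> [<#1,#1>, {t}]" for t in sorted(basic_types)]


print(application_production(("e", ("r", ("d", "r")))))
print(identity_productions({"e", "r"}))
```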
- allennlp.semparse.type_declarations.type_declaration.is_nonterminal(production: str) → bool
- allennlp.semparse.type_declarations.type_declaration.substitute_any_type(type_: nltk.sem.logic.Type, basic_types: Set[nltk.sem.logic.BasicType]) → List[nltk.sem.logic.Type]
Takes a type and a set of basic types, and substitutes all instances of ANY_TYPE with all possible basic types and returns a list with all possible combinations. Note that this substitution is unconstrained. That is, if you have a type with placeholders, <#1,#1> for example, this may substitute the placeholders with different basic types. In that case, you’d want to use _substitute_placeholder_type instead.
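The difference between unconstrained and placeholder-aware substitution can be seen in a small model. Types are encoded as strings and tuples, with "?" standing in for ANY_TYPE. Both functions below are hypothetical illustrations, not the library's substitute_any_type or _substitute_placeholder_type.

```python
from itertools import product
from typing import List, Set, Tuple, Union

ANY = "?"  # stands in for ANY_TYPE in this sketch
TypeLike = Union[str, Tuple["TypeLike", "TypeLike"]]


def substitute_unconstrained(type_: TypeLike, basic_types: Set[str]) -> List[TypeLike]:
    # Every wildcard is filled independently of every other wildcard.
    if type_ == ANY:
        return sorted(basic_types)
    if isinstance(type_, tuple):
        firsts = substitute_unconstrained(type_[0], basic_types)
        seconds = substitute_unconstrained(type_[1], basic_types)
        return [(f, s) for f, s in product(firsts, seconds)]
    return [type_]


def substitute_placeholder(basic_types: Set[str]) -> List[TypeLike]:
    # For <#1,#1>, both slots must receive the same basic type.
    return [(t, t) for t in sorted(basic_types)]


print(substitute_unconstrained((ANY, ANY), {"e", "r"}))
print(substitute_placeholder({"e", "r"}))
```

The unconstrained version yields mixed pairs such as <e,r>, which would be wrong for a type whose two slots are meant to agree.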
Defines all the types in the QuaRel domain.