text_field_embedder
allennlp.modules.text_field_embedders.text_field_embedder
TextFieldEmbedder#
class TextFieldEmbedder(torch.nn.Module, Registrable)
A TextFieldEmbedder
is a Module
that takes as input the
DataArray
produced by a TextField
and
returns as output an embedded representation of the tokens in that field.
The DataArrays
produced by TextFields
are dictionaries with named representations, like
"words" and "characters". When you create a TextField
, you pass in a dictionary of
TokenIndexer
objects, telling the field how exactly the
tokens in the field should be represented. This class changes the type signature of Module.forward
,
restricting TextFieldEmbedders
to take inputs corresponding to a single TextField
, which is
a dictionary of tensors with the same names as were passed to the TextField
.
We also add a method to the basic Module
API: get_output_dim()
. You might need this
if you want to construct a Linear
layer using the output of this embedder, for instance.
default_implementation#
class TextFieldEmbedder(torch.nn.Module, Registrable):
| ...
| default_implementation = "basic"
forward#
class TextFieldEmbedder(torch.nn.Module, Registrable):
| ...
| def forward(
| self,
| text_field_input: TextFieldTensors,
| num_wrapping_dims: int = 0,
| **kwargs
| ) -> torch.Tensor
Parameters
- text_field_input :
TextFieldTensors
A dictionary that was the output of a call toTextField.as_tensor
. Each tensor in here is assumed to have a shape roughly similar to(batch_size, sequence_length)
(perhaps with an extra trailing dimension for the characters in each token). - num_wrapping_dims :
int
, optional (default =0
)
If you have aListField[TextField]
that created thetext_field_input
, you'll end up with tensors of shape(batch_size, wrapping_dim1, wrapping_dim2, ..., sequence_length)
. This parameter tells us how many wrapping dimensions there are, so that we can correctlyTimeDistribute
the embedding of each named representation.
get_output_dim#
class TextFieldEmbedder(torch.nn.Module, Registrable):
| ...
| def get_output_dim(self) -> int
Returns the dimension of the vector representing each token in the output of this
TextFieldEmbedder
. This is not the shape of the returned tensor, but the last element
of that shape.