language_model
allennlp_models.lm.modules.token_embedders.language_model
LanguageModelTokenEmbedder#
@TokenEmbedder.register("language_model_token_embedder")
class LanguageModelTokenEmbedder(TokenEmbedder):
| def __init__(
|     self,
|     archive_file: str,
|     dropout: float = None,
|     bos_eos_tokens: Tuple[str, str] = ("<S>", "</S>"),
|     remove_bos_eos: bool = True,
|     requires_grad: bool = False
| ) -> None
Compute a single layer of representations from an (optionally bidirectional)
language model. This is done by computing a learned scalar
average of the layers from the LM. Typically the LM's weights
will be fixed, but they can be fine-tuned by setting requires_grad.
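The learned scalar average can be pictured as follows. This is a hedged sketch, not the embedder's internal code: per-layer representations are weighted by softmax-normalized learned parameters and scaled by a learned global factor.

```python
import torch
from typing import List

# Hedged sketch of a learned scalar average over LM layers (illustrative
# only, not the embedder's actual implementation). Each entry of `layers`
# is a (batch_size, timesteps, dim) tensor from one LM layer.
def scalar_average(layers: List[torch.Tensor],
                   weights: torch.Tensor,
                   gamma: torch.Tensor) -> torch.Tensor:
    normalized = torch.softmax(weights, dim=0)  # one learned weight per layer
    return gamma * sum(w * layer for w, layer in zip(normalized, layers))

layers = [torch.randn(2, 5, 16) for _ in range(3)]
weights = torch.zeros(3, requires_grad=True)  # learned per-layer weights
gamma = torch.ones(1, requires_grad=True)     # learned global scale
mixed = scalar_average(layers, weights, gamma)  # shape (2, 5, 16)
```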
Parameters¶
- archive_file : str
  An archive file, typically model.tar.gz, from a LanguageModel. The contextualizer used by the LM must satisfy two requirements:
    1. It must have a num_layers field.
    2. It must take a boolean return_all_layers parameter in its constructor.
  See BidirectionalLanguageModelTransformer for their definitions.
- dropout : float, optional (default = None)
  The dropout value to be applied to the representations.
- bos_eos_tokens : Tuple[str, str], optional (default = ("<S>", "</S>"))
  These will be indexed and placed around the indexed tokens. Necessary if the language model was trained with them but they were injected externally to an indexer.
- remove_bos_eos : bool, optional (default = True)
  Typically the provided token indexes will be augmented with begin-sentence and end-sentence tokens. (Alternatively, you can pass bos_eos_tokens.) If this flag is True, the corresponding embeddings will be removed from the return values. Warning: this only removes a single start token and a single end token!
- requires_grad : bool, optional (default = False)
  If True, compute gradients of the bidirectional language model parameters for fine-tuning.
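A minimal construction sketch (the archive path below is hypothetical; the arguments mirror the constructor signature above):

```python
from allennlp_models.lm.modules.token_embedders.language_model import (
    LanguageModelTokenEmbedder,
)

# Hypothetical archive path; point this at a trained LanguageModel's
# model.tar.gz. The defaults are spelled out here for clarity.
embedder = LanguageModelTokenEmbedder(
    archive_file="/path/to/language_model/model.tar.gz",
    dropout=0.2,
    bos_eos_tokens=("<S>", "</S>"),
    remove_bos_eos=True,
    requires_grad=False,  # keep the LM weights frozen
)
```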
get_output_dim#
class LanguageModelTokenEmbedder(TokenEmbedder):
| ...
| def get_output_dim(self) -> int
forward#
class LanguageModelTokenEmbedder(TokenEmbedder):
| ...
| def forward(self, tokens: torch.Tensor) -> Dict[str, torch.Tensor]
Parameters¶
- tokens : torch.Tensor
  Shape (batch_size, timesteps, ...) of token ids representing the current batch. These must have been produced using the same indexer the LM was trained on.
Returns¶
- The bidirectional language model representations for the input sequence, shape (batch_size, timesteps, embedding_dim).
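A hedged usage sketch, assuming the embedder constructed above and that the output matches the Returns description:

```python
import torch

# Dummy token ids of shape (batch_size, timesteps); real ids must be
# produced by the same indexer the LM was trained on.
token_ids = torch.randint(low=2, high=100, size=(8, 20))

representations = embedder(token_ids)

# Per the Returns section: (batch_size, timesteps, embedding_dim).
assert representations.shape == (8, 20, embedder.get_output_dim())
```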