language_model

allennlp_models.lm.modules.token_embedders.language_model


LanguageModelTokenEmbedder

@TokenEmbedder.register("language_model_token_embedder")
class LanguageModelTokenEmbedder(TokenEmbedder):
 | def __init__(
 |     self,
 |     archive_file: str,
 |     dropout: float = None,
 |     bos_eos_tokens: Tuple[str, str] = ("<S>", "</S>"),
 |     remove_bos_eos: bool = True,
 |     requires_grad: bool = False
 | ) -> None

Computes a single layer of representations from an (optionally bidirectional) language model by taking a learned scalar average of the LM's layers. The LM's weights are typically kept fixed, but they can be fine-tuned by setting requires_grad to True.
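The "learned scalar average" can be illustrated with a minimal, dependency-free sketch. The softmax normalization of the weights and the extra scale factor gamma are assumptions modeled on ELMo-style scalar mixing, not something this page states; scalar_mix and its arguments are hypothetical names used only for illustration.

```python
import math
from typing import List, Sequence


def scalar_mix(
    layers: Sequence[Sequence[float]],
    scalar_parameters: Sequence[float],
    gamma: float = 1.0,
) -> List[float]:
    """Combine per-layer vectors for one token into a single vector.

    Assumption: the learned scalars are softmax-normalized and the weighted
    sum is scaled by gamma. In a real model both would be trained parameters.
    """
    # Numerically stable softmax over the per-layer scalars.
    largest = max(scalar_parameters)
    exps = [math.exp(s - largest) for s in scalar_parameters]
    total = sum(exps)
    weights = [e / total for e in exps]

    # Weighted sum across layers, computed dimension by dimension.
    dim = len(layers[0])
    return [
        gamma * sum(w * layer[i] for w, layer in zip(weights, layers))
        for i in range(dim)
    ]


# Three layers, each giving a 2-dimensional vector for the same token.
layers = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]

# Equal scalars reduce to a plain average of the layers.
mixed = scalar_mix(layers, [0.0, 0.0, 0.0])

# One dominant scalar makes the mix follow that layer almost exactly.
first_layer_only = scalar_mix(layers, [100.0, 0.0, 0.0])
```

Whatever the depth of the language model, the mix produces one vector per token, which is why the embedder exposes a single layer of representations.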

Parameters

  • archive_file : str
    An archive file, typically model.tar.gz, from a LanguageModel. The contextualizer used by the LM must satisfy two requirements:

    1. It must have a num_layers field.
    2. It must take a boolean return_all_layers parameter in its constructor.

    See BidirectionalLanguageModelTransformer for their definitions.

  • dropout : float, optional
    The dropout value to be applied to the representations.

  • bos_eos_tokens : Tuple[str, str], optional (default = ("<S>", "</S>"))
    These will be indexed and placed around the indexed tokens. Necessary if the language model was trained with them, but they were injected outside of an indexer.

  • remove_bos_eos : bool, optional (default = True)
    Typically the provided token indices will be augmented with begin-sentence and end-sentence tokens. (Alternatively, you can pass bos_eos_tokens.) If this flag is True, the corresponding embeddings will be removed from the return values.

    Warning: This only removes a single start and single end token!

  • requires_grad : bool, optional (default = False)
    If True, compute gradient of bidirectional language model parameters for fine tuning.
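As a rough illustration of how the constructor parameters fit together, the sketch below writes them out as a plain Python dictionary shaped like a configuration fragment. The registered name and parameter names come from the signature above; the archive path and the dropout value are placeholders, and where such a fragment belongs in a full experiment configuration is not covered here.

```python
import json

# Hypothetical configuration fragment for the embedder.
embedder_config = {
    # Name passed to TokenEmbedder.register in the class definition.
    "type": "language_model_token_embedder",
    # Placeholder path: point this at a real LanguageModel archive.
    "archive_file": "/path/to/model.tar.gz",
    # Illustrative value only; the default is None.
    "dropout": 0.2,
    # Defaults from the signature, spelled out for clarity.
    "bos_eos_tokens": ["<S>", "</S>"],
    "remove_bos_eos": True,
    # Keep the LM weights fixed; set to True to fine-tune them.
    "requires_grad": False,
}

print(json.dumps(embedder_config, indent=2))
```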

get_output_dim

class LanguageModelTokenEmbedder(TokenEmbedder):
 | ...
 | def get_output_dim(self) -> int

forward

class LanguageModelTokenEmbedder(TokenEmbedder):
 | ...
 | def forward(self, tokens: torch.Tensor) -> Dict[str, torch.Tensor]

Parameters

  • tokens : torch.Tensor
    Shape (batch_size, timesteps, ...) of token ids representing the current batch. These must have been produced using the same indexer the LM was trained on.

Returns

  • The bidirectional language model representations for the input sequence, shape (batch_size, timesteps, embedding_dim).
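The effect of remove_bos_eos on the returned shape can be sketched without any tensor library. The helper below is hypothetical and is not the library's implementation: it assumes each sequence carries exactly one begin-sentence and one end-sentence position, and that positions beyond a sequence's true length are padding.

```python
from typing import List, Tuple

Vector = List[float]


def remove_boundaries(
    batch: List[List[Vector]], lengths: List[int]
) -> Tuple[List[List[Vector]], List[int]]:
    """Drop one leading and one trailing boundary vector per sequence.

    batch holds padded sequences of per-token vectors, and lengths gives the
    true length of each sequence including its two boundary positions.
    """
    dim = len(batch[0][0])
    new_timesteps = len(batch[0]) - 2  # every sequence loses two positions
    trimmed, new_lengths = [], []
    for sequence, length in zip(batch, lengths):
        kept = sequence[1:length - 1]  # skip the start and end positions
        padding = [[0.0] * dim for _ in range(new_timesteps - len(kept))]
        trimmed.append(kept + padding)
        new_lengths.append(length - 2)
    return trimmed, new_lengths


# Two sequences padded to 5 timesteps, embedding_dim = 1.
batch = [
    [[0.5], [1.0], [2.0], [3.0], [9.5]],  # <S> a b c </S>
    [[0.5], [1.0], [9.5], [0.0], [0.0]],  # <S> a </S> pad pad
]
trimmed, new_lengths = remove_boundaries(batch, [5, 3])
```

Only one position is removed from each end, so a sequence that somehow carries two start tokens would keep one of them, which is the situation the warning under remove_bos_eos describes.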