allennlp.modules.token_embedders.pretrained_transformer_embedder#

PretrainedTransformerEmbedder#

PretrainedTransformerEmbedder(self, model_name: str, max_length: int = None) -> None

Uses a pretrained model from transformers as a TokenEmbedder.

Registered as a TokenEmbedder with name "pretrained_transformer".

Parameters

  • model_name : str The name of the transformers model to use. Should be the same as the model_name used by the corresponding PretrainedTransformerIndexer.
  • max_length : int, optional (default = None) If positive, folds input token IDs into multiple segments of this length, passes them through the transformer model independently, and concatenates the final representations. Should be set to the same value as the max_length option on the corresponding PretrainedTransformerIndexer.
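
A minimal usage sketch follows; the model name "bert-base-uncased" is only an example, not something prescribed by this class, and any model name accepted by the transformers library should work.

from allennlp.modules.token_embedders import PretrainedTransformerEmbedder

# "bert-base-uncased" is an example model name, not a requirement of this class.
embedder = PretrainedTransformerEmbedder(model_name="bert-base-uncased")

# When inputs may exceed the transformer's window, fold them into segments of
# max_length wordpieces; use the same value on the PretrainedTransformerIndexer.
long_input_embedder = PretrainedTransformerEmbedder(
    model_name="bert-base-uncased",
    max_length=512,
)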

forward#

PretrainedTransformerEmbedder.forward(
    self,
    token_ids: torch.LongTensor,
    mask: torch.BoolTensor,
    type_ids: Optional[torch.LongTensor] = None,
    segment_concat_mask: Optional[torch.BoolTensor] = None,
) -> torch.Tensor

Parameters

  • token_ids : torch.LongTensor Shape: [batch_size, num_wordpieces] if max_length is None, else [batch_size, num_segment_concat_wordpieces]. num_segment_concat_wordpieces is num_wordpieces plus the special tokens inserted in the middle, e.g. the length of "[CLS] A B C [SEP] [CLS] D E F [SEP]" (see indexer logic).
  • mask : torch.BoolTensor Shape: [batch_size, num_wordpieces].
  • type_ids : Optional[torch.LongTensor] Shape: [batch_size, num_wordpieces] if max_length is None, else [batch_size, num_segment_concat_wordpieces].
  • segment_concat_mask : Optional[torch.BoolTensor] Shape: [batch_size, num_segment_concat_wordpieces].

Returns

Shape: [batch_size, num_wordpieces, embedding_size].
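
A hedged sketch of a forward call with dummy inputs: the token IDs below are arbitrary placeholders (in practice they come from a PretrainedTransformerIndexer), and the output size of 768 assumes a BERT-base model.

import torch
from allennlp.modules.token_embedders import PretrainedTransformerEmbedder

embedder = PretrainedTransformerEmbedder(model_name="bert-base-uncased")

batch_size, num_wordpieces = 2, 7
# Placeholder wordpiece IDs; real IDs come from a PretrainedTransformerIndexer.
token_ids = torch.randint(low=5, high=1000, size=(batch_size, num_wordpieces))
mask = torch.ones(batch_size, num_wordpieces, dtype=torch.bool)

embeddings = embedder(token_ids=token_ids, mask=mask)
print(embeddings.shape)  # torch.Size([2, 7, 768]) for a BERT-base model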

get_output_dim#

PretrainedTransformerEmbedder.get_output_dim(self)

Returns the final output dimension that this TokenEmbedder uses to represent each token. This is not the shape of the returned tensor, but the last element of that shape.
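
For example, assuming the same bert-base-uncased model as above, whose hidden size is 768:

from allennlp.modules.token_embedders import PretrainedTransformerEmbedder

embedder = PretrainedTransformerEmbedder(model_name="bert-base-uncased")
print(embedder.get_output_dim())  # 768, the last dimension of forward's output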