pretrained_transformer_backbone
allennlp.modules.backbones.pretrained_transformer_backbone
PretrainedTransformerBackbone¶
@Backbone.register("pretrained_transformer")
class PretrainedTransformerBackbone(Backbone):
| def __init__(
| self,
| vocab: Vocabulary,
| model_name: str,
| *,
| max_length: int = None,
| sub_module: str = None,
| train_parameters: bool = True,
| last_layer_only: bool = True,
| override_weights_file: Optional[str] = None,
| override_weights_strip_prefix: Optional[str] = None,
| tokenizer_kwargs: Optional[Dict[str, Any]] = None,
| transformer_kwargs: Optional[Dict[str, Any]] = None,
| output_token_strings: bool = True,
| vocab_namespace: str = "tags"
| ) -> None
Uses a pretrained model from transformers as a Backbone.
This class passes most of its arguments to a PretrainedTransformerEmbedder, which it uses to
implement the underlying encoding logic (we duplicate the arguments here instead of taking an
Embedder as a constructor argument just to simplify the user-facing API).
Registered as a Backbone with name "pretrained_transformer".
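Because of this registration, the backbone can be selected by name from a configuration fragment. Below is a minimal sketch, assuming the usual package-level imports and the purely illustrative model name "bert-base-uncased":

```python
from allennlp.common import Params
from allennlp.data import Vocabulary
from allennlp.modules.backbones import Backbone

# "bert-base-uncased" is only an example; any transformers model name works here.
params = Params({"type": "pretrained_transformer", "model_name": "bert-base-uncased"})

# `vocab` is supplied as an extra because it is a constructor argument that does
# not appear in the config fragment itself.
backbone = Backbone.from_params(params, vocab=Vocabulary())
```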
Parameters¶
- vocab : Vocabulary
    Necessary for converting input ids to strings in make_output_human_readable. If you set output_token_strings to False, or if you never call make_output_human_readable, then this will not be used and can be safely set to None.
- model_name : str
    The name of the transformers model to use. Should be the same as the corresponding PretrainedTransformerIndexer.
- max_length : int, optional (default = None)
    If positive, folds input token IDs into multiple segments of this length, passes them through the transformer model independently, and concatenates the final representations. Should be set to the same value as the max_length option on the PretrainedTransformerIndexer.
- sub_module : str, optional (default = None)
    The name of a submodule of the transformer to be used as the embedder. Some transformers naturally act as embedders, such as BERT. However, other models consist of an encoder and a decoder, in which case we only want to use the encoder.
- train_parameters : bool, optional (default = True)
    If this is True, the transformer weights get updated during training.
- last_layer_only : bool, optional (default = True)
    When True (the default), only the final layer of the pretrained transformer is used for the embeddings. If set to False, a scalar mix of all of the layers is used instead.
- tokenizer_kwargs : Dict[str, Any], optional (default = None)
    Dictionary with additional arguments for AutoTokenizer.from_pretrained.
- transformer_kwargs : Dict[str, Any], optional (default = None)
    Dictionary with additional arguments for AutoModel.from_pretrained.
- output_token_strings : bool, optional (default = True)
    If True, we add the input token ids to the output dictionary in forward (with key "token_ids") and convert them to strings in make_output_human_readable (with key "tokens"). This is necessary for certain demo functionality, and it adds only a trivial amount of computation if you are not using a demo.
- vocab_namespace : str, optional (default = "tags")
    The namespace to use in conjunction with the Vocabulary above. We use a somewhat confusing default of "tags" here, to match what is done in PretrainedTransformerIndexer.
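As a rough illustration of the keyword-only parameters above, the backbone can also be constructed directly in Python. The model name and the particular values chosen below are only examples:

```python
from allennlp.data import Vocabulary
from allennlp.modules.backbones.pretrained_transformer_backbone import (
    PretrainedTransformerBackbone,
)

# Everything after `model_name` is keyword-only; the values shown are illustrative.
backbone = PretrainedTransformerBackbone(
    vocab=Vocabulary(),
    model_name="bert-base-uncased",
    train_parameters=False,      # freeze the transformer weights during training
    last_layer_only=True,        # no scalar mix over intermediate layers
    output_token_strings=True,   # keep token ids around for human-readable output
    vocab_namespace="tags",      # must match the indexer's namespace
)
```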
forward¶
class PretrainedTransformerBackbone(Backbone):
| ...
| def forward(self, text: TextFieldTensors) -> Dict[str, torch.Tensor]
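The sketch below shows one way to build TextFieldTensors and pass them through forward. It assumes the current allennlp package layout, a single PretrainedTransformerIndexer keyed as "tokens", and the illustrative model "bert-base-uncased"; the exact set of output keys beyond "token_ids" (such as the encoded-text tensor and its mask) should be checked against the forward implementation.

```python
from allennlp.data import Batch, Instance, Vocabulary
from allennlp.data.fields import TextField
from allennlp.data.token_indexers import PretrainedTransformerIndexer
from allennlp.data.tokenizers import PretrainedTransformerTokenizer
from allennlp.modules.backbones.pretrained_transformer_backbone import (
    PretrainedTransformerBackbone,
)

model_name = "bert-base-uncased"  # illustrative choice of pretrained model
tokenizer = PretrainedTransformerTokenizer(model_name)
indexers = {"tokens": PretrainedTransformerIndexer(model_name)}  # namespace defaults to "tags"

# Build a one-instance batch of TextFieldTensors.
tokens = tokenizer.tokenize("AllenNLP backbones wrap transformers.")
instance = Instance({"text": TextField(tokens, indexers)})

vocab = Vocabulary()
batch = Batch([instance])
batch.index_instances(vocab)  # the indexer copies the transformer vocab into the "tags" namespace
tensors = batch.as_tensor_dict()

backbone = PretrainedTransformerBackbone(vocab=vocab, model_name=model_name)
outputs = backbone(tensors["text"])  # Dict[str, torch.Tensor]
print(outputs.keys())  # includes "token_ids", since output_token_strings defaults to True
```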
make_output_human_readable¶
class PretrainedTransformerBackbone(Backbone):
| ...
| def make_output_human_readable(
| self,
| output_dict: Dict[str, torch.Tensor]
| ) -> Dict[str, torch.Tensor]
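Continuing the forward sketch above, make_output_human_readable adds the decoded token strings under the "tokens" key alongside the existing entries. For the lookup to succeed, vocab_namespace must match the namespace the indexer wrote the transformer vocabulary into ("tags" by default for both):

```python
readable = backbone.make_output_human_readable(outputs)

# "tokens" holds the string form of "token_ids": one list of wordpiece strings per
# instance, e.g. starting with "[CLS]" for BERT-style tokenizers.
print(readable["tokens"][0][:5])
```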