pretrained_transformer_backbone
allennlp.modules.backbones.pretrained_transformer_backbone
PretrainedTransformerBackbone¶
@Backbone.register("pretrained_transformer")
class PretrainedTransformerBackbone(Backbone):
| def __init__(
| self,
| vocab: Vocabulary,
| model_name: str,
| *,
| max_length: int = None,
| sub_module: str = None,
| train_parameters: bool = True,
| last_layer_only: bool = True,
| override_weights_file: Optional[str] = None,
| override_weights_strip_prefix: Optional[str] = None,
| tokenizer_kwargs: Optional[Dict[str, Any]] = None,
| transformer_kwargs: Optional[Dict[str, Any]] = None,
| output_token_strings: bool = True,
| vocab_namespace: str = "tags"
| ) -> None
Uses a pretrained model from transformers as a Backbone.

This class passes most of its arguments to a PretrainedTransformerEmbedder, which it uses to implement the underlying encoding logic (we duplicate the arguments here instead of taking an Embedder as a constructor argument just to simplify the user-facing API).

Registered as a Backbone with name "pretrained_transformer".
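Because it is registered under that name, the backbone can be selected by "type" in an experiment configuration. A minimal sketch follows; only the "type" key and the constructor arguments documented below come from this page, and the model name and values are illustrative assumptions:

```python
# Hypothetical config fragment (expressed as a Python dict) selecting this
# backbone by its registered name; values other than "type" are examples only.
backbone_config = {
    "type": "pretrained_transformer",   # the registered Backbone name
    "model_name": "bert-base-uncased",  # any transformers model name
    "max_length": 512,                  # keep in sync with the PretrainedTransformerIndexer
}
```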
Parameters¶
- vocab : Vocabulary
  Necessary for converting input ids to strings in make_output_human_readable. If you set output_token_strings to False, or if you never call make_output_human_readable, then this will not be used and can be safely set to None.
- model_name : str
  The name of the transformers model to use. Should be the same as the corresponding PretrainedTransformerIndexer.
- max_length : int, optional (default = None)
  If positive, folds input token IDs into multiple segments of this length, passes them through the transformer model independently, and concatenates the final representations. Should be set to the same value as the max_length option on the PretrainedTransformerIndexer.
- sub_module : str, optional (default = None)
  The name of a submodule of the transformer to be used as the embedder. Some transformers, such as BERT, naturally act as embedders; others consist of an encoder and a decoder, in which case we just want to use the encoder.
- train_parameters : bool, optional (default = True)
  If this is True, the transformer weights get updated during training.
- last_layer_only : bool, optional (default = True)
  When True (the default), only the final layer of the pretrained transformer is used for the embeddings. If set to False, a scalar mix of all of the layers is used instead.
- tokenizer_kwargs : Dict[str, Any], optional (default = None)
  Dictionary with additional arguments for AutoTokenizer.from_pretrained.
- transformer_kwargs : Dict[str, Any], optional (default = None)
  Dictionary with additional arguments for AutoModel.from_pretrained.
- output_token_strings : bool, optional (default = True)
  If True, we will add the input token ids to the output dictionary in forward (with key "token_ids") and convert them to strings in make_output_human_readable (with key "tokens"). This is necessary for certain demo functionality, and it adds only a trivial amount of computation if you are not using a demo.
- vocab_namespace : str, optional (default = "tags")
  The namespace to use in conjunction with the Vocabulary above. We use a somewhat confusing default of "tags" here, to match what is done in PretrainedTransformerIndexer.
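Putting the parameters above together, here is a minimal construction sketch. It assumes the class is importable from allennlp.modules.backbones, and the argument values are illustrative rather than recommended settings:

```python
# A minimal construction sketch; the import path and argument values are
# assumptions for illustration, not prescriptions from this page.
from allennlp.data import Vocabulary
from allennlp.modules.backbones import PretrainedTransformerBackbone

vocab = Vocabulary()  # only needed if make_output_human_readable will be called
backbone = PretrainedTransformerBackbone(
    vocab=vocab,
    model_name="bert-base-uncased",   # same model name as the indexer
    train_parameters=False,           # freeze the transformer weights
    last_layer_only=True,             # use only the final transformer layer
)
```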
forward¶
class PretrainedTransformerBackbone(Backbone):
| ...
| def forward(self, text: TextFieldTensors) -> Dict[str, torch.Tensor]
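Continuing the construction sketch above, the snippet below shows one way the TextFieldTensors input might be produced with a matching PretrainedTransformerIndexer and then passed to forward. The data-pipeline classes and import locations are assumptions based on the standard AllenNLP workflow, not part of this page; only the "token_ids" output key is taken from the parameter description above:

```python
# Sketch: build a one-instance batch with a matching transformer indexer,
# then run it through the backbone. Import paths are assumed.
from allennlp.data import Batch, Instance
from allennlp.data.fields import TextField
from allennlp.data.token_indexers import PretrainedTransformerIndexer
from allennlp.data.tokenizers import PretrainedTransformerTokenizer

tokenizer = PretrainedTransformerTokenizer("bert-base-uncased")
indexers = {"tokens": PretrainedTransformerIndexer("bert-base-uncased")}  # namespace defaults to "tags"
tokens = tokenizer.tokenize("AllenNLP backbones wrap pretrained transformers.")
instance = Instance({"text": TextField(tokens, indexers)})
instance.index_fields(vocab)

batch = Batch([instance])
text_field_tensors = batch.as_tensor_dict()["text"]

outputs = backbone(text_field_tensors)   # calls forward
token_ids = outputs["token_ids"]         # present because output_token_strings=True (default)
```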
make_output_human_readable¶
class PretrainedTransformerBackbone(Backbone):
| ...
| def make_output_human_readable(
| self,
| output_dict: Dict[str, torch.Tensor]
| ) -> Dict[str, torch.Tensor]
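And a corresponding sketch of converting the forward output into strings. The "tokens" key comes from the output_token_strings description above; this requires that vocab is a real Vocabulary whose vocab_namespace has been populated (here, by the indexer during index_fields):

```python
# Converts the "token_ids" stored by forward into token strings under "tokens".
readable = backbone.make_output_human_readable(outputs)
print(readable["tokens"])  # per-instance lists of token strings
```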