masked_language_model
allennlp_models.lm.models.masked_language_model
MaskedLanguageModel#
@Model.register("masked_language_model")
class MaskedLanguageModel(Model):
| def __init__(
| self,
| vocab: Vocabulary,
| text_field_embedder: TextFieldEmbedder,
| language_model_head: LanguageModelHead,
| contextualizer: Seq2SeqEncoder = None,
| target_namespace: str = "bert",
| dropout: float = 0.0,
| initializer: InitializerApplicator = None,
| **kwargs
| ) -> None
The MaskedLanguageModel embeds some input tokens (including some which are masked),
contextualizes them, then predicts targets for the masked tokens, computing a loss against
known targets.
NOTE: This was developed for use in a demo, not for training. It may still work for training
a masked LM, but some other code would very likely be much more efficient for that. This does
compute correct gradients of the loss, because we use that in our demo, so in principle it
should be able to train a model; we just don't necessarily endorse that use.
Parameters
- vocab : Vocabulary
- text_field_embedder : TextFieldEmbedder
  Used to embed the indexed tokens we get in forward.
- language_model_head : LanguageModelHead
  The torch.nn.Module that goes from the hidden states output by the contextualizer to logits over some output vocabulary.
- contextualizer : Seq2SeqEncoder, optional (default = None)
  Used to "contextualize" the embeddings. This is optional because the contextualization might actually be done in the text field embedder.
- target_namespace : str, optional (default = 'bert')
  Namespace to use to convert predicted token ids to strings in Model.make_output_human_readable.
- dropout : float, optional (default = 0.0)
  If specified, dropout is applied to the contextualized embeddings before computation of the softmax. The contextualized embeddings themselves are returned without dropout.
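The dropout behavior described above can be illustrated with a minimal PyTorch sketch. This is not the library's implementation: `head` is a plain `torch.nn.Linear` standing in for a LanguageModelHead, and the dictionary keys are hypothetical names chosen for the example. It only shows the idea that dropout is applied to the copy of the embeddings fed to the head, while the embeddings that are returned stay untouched.

```python
import torch

torch.manual_seed(0)

batch_size, seq_len, hidden_dim, vocab_size = 2, 5, 8, 11

# Hypothetical stand-ins, not the AllenNLP classes themselves.
head = torch.nn.Linear(hidden_dim, vocab_size)  # plays the role of the LM head
dropout = torch.nn.Dropout(p=0.5)               # plays the role of the `dropout` parameter

# Pretend these came out of the text field embedder / contextualizer.
contextual_embeddings = torch.randn(batch_size, seq_len, hidden_dim)

# Dropout is applied only on the path into the head; nn.Dropout is not
# in-place by default, so `contextual_embeddings` is left unmodified.
logits = head(dropout(contextual_embeddings))
probabilities = torch.softmax(logits, dim=-1)

# Hypothetical output dictionary: embeddings are returned without dropout.
output = {
    "contextual_embeddings": contextual_embeddings,
    "probabilities": probabilities,
}
```

Keeping dropout off the returned embeddings means downstream consumers see the same representation regardless of whether the module is in training mode.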
forward#
class MaskedLanguageModel(Model):
| ...
| def forward(
| self,
| tokens: TextFieldTensors,
| mask_positions: torch.BoolTensor,
| target_ids: TextFieldTensors = None
| ) -> Dict[str, torch.Tensor]
Parameters
- tokens : TextFieldTensors
  The output of TextField.as_tensor() for a batch of sentences.
- mask_positions : torch.LongTensor
  The positions in tokens that correspond to [MASK] tokens that we should try to fill in. Shape should be (batch_size, num_masks). Note that the signature above annotates this argument as torch.BoolTensor, while this description and shape imply integer positions.
- target_ids : TextFieldTensors, optional (default = None)
  This is a list of token ids that correspond to the mask positions we're trying to fill. It is the output of a TextField, purely for convenience, so we can handle wordpiece tokenizers and such without having to do crazy things in the dataset reader. We assume that there is exactly one entry in the dictionary, and that it has a shape identical to mask_positions: one target token per mask position.
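To make the shapes above concrete, here is a minimal PyTorch sketch of selecting the hidden states at the mask positions and scoring them against target ids. It is an illustration under stated assumptions, not the model's actual code: `head` is a plain `torch.nn.Linear` standing in for the language model head, the tensors are random, and `mask_positions` is assumed to hold integer indices of shape (batch_size, num_masks).

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)

batch_size, seq_len, hidden_dim, vocab_size = 2, 6, 8, 11

# Hypothetical stand-ins for the contextualized embeddings and the LM head.
contextual_embeddings = torch.randn(batch_size, seq_len, hidden_dim)
head = torch.nn.Linear(hidden_dim, vocab_size)

# (batch_size, num_masks): which token positions are [MASK] in each sentence.
mask_positions = torch.tensor([[1, 4], [0, 3]])
# Same shape as mask_positions: one target token id per mask position.
target_ids = torch.tensor([[7, 2], [5, 9]])

# Advanced indexing: pair every row index with that row's mask positions.
batch_index = torch.arange(batch_size).unsqueeze(1)                   # (batch_size, 1)
mask_embeddings = contextual_embeddings[batch_index, mask_positions]  # (batch_size, num_masks, hidden_dim)

# Logits over the output vocabulary for each masked position.
target_logits = head(mask_embeddings)                                 # (batch_size, num_masks, vocab_size)

# Cross-entropy against the known targets, flattened over all mask positions.
loss = F.cross_entropy(
    target_logits.reshape(-1, vocab_size),
    target_ids.reshape(-1),
)
loss.backward()  # gradients flow back to the head's parameters
```

Because `target_ids` has the same shape as `mask_positions`, flattening both the logits and the targets lines up each prediction with exactly one target token.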
get_metrics#
class MaskedLanguageModel(Model):
| ...
| def get_metrics(self, reset: bool = False)
make_output_human_readable#
class MaskedLanguageModel(Model):
| ...
| @overrides
| def make_output_human_readable(
| self,
| output_dict: Dict[str, torch.Tensor]
| ) -> Dict[str, torch.Tensor]
default_predictor#
class MaskedLanguageModel(Model):
| ...
| default_predictor = "masked_language_model"