[ allennlp.modules.seq2vec_encoders.cls_pooler ]
@Seq2VecEncoder.register("cls_pooler") class ClsPooler(Seq2VecEncoder): | def __init__( | self, | embedding_dim: int = None, | cls_is_last_token: bool = False | )
Just takes the first vector from a list of vectors (which in a transformer is typically the
[CLS] token) and returns it. For BERT, it's recommended to use `BertPooler` instead.
Registered as a `Seq2VecEncoder` with name "cls_pooler".
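As a quick illustration, here is a minimal sketch of constructing the pooler both directly and through the registry using that name; the `embedding_dim=768` value is just a placeholder for whatever dimension your encoder outputs:

```python
from allennlp.common import Params
from allennlp.modules.seq2vec_encoders import ClsPooler, Seq2VecEncoder

# Direct construction.
pooler = ClsPooler(embedding_dim=768)

# Equivalent construction through the registry, using the registered name.
pooler = Seq2VecEncoder.from_params(Params({"type": "cls_pooler", "embedding_dim": 768}))
```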
- embedding_dim :
This isn't needed for any computation that we do, but we sometimes rely on `get_output_dim` to check parameter settings, or to instantiate final linear layers. In order to give the right values there, we need to know the embedding dimension. If you're using this with a transformer from the `transformers` library, this can often be found with `model.config.hidden_size`, if you're not sure (see the sketch after this list).
- cls_is_last_token :
The [CLS] token is the first token for most of the pretrained transformer models. For some models, such as XLNet, it is the last token, so we need to select the vector at the end of the sequence instead (see the sketch after this list).
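To make both parameters concrete, here is a hedged sketch that reads the dimension off a `transformers` config as suggested above, and flips `cls_is_last_token` for an XLNet-style model; the model names are only examples:

```python
from transformers import AutoConfig
from allennlp.modules.seq2vec_encoders import ClsPooler

# BERT-style model: [CLS] is the first token, so the defaults are fine.
bert_config = AutoConfig.from_pretrained("bert-base-uncased")
bert_pooler = ClsPooler(embedding_dim=bert_config.hidden_size)  # 768 for bert-base

# XLNet-style model: [CLS] comes last, so we select from the end instead.
xlnet_config = AutoConfig.from_pretrained("xlnet-base-cased")
xlnet_pooler = ClsPooler(embedding_dim=xlnet_config.hidden_size, cls_is_last_token=True)
```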
class ClsPooler(Seq2VecEncoder): | ... | @overrides | def get_input_dim(self) -> int
class ClsPooler(Seq2VecEncoder): | ... | @overrides | def get_output_dim(self) -> int
class ClsPooler(Seq2VecEncoder): | ... | @overrides | def forward( | self, | tokens: torch.Tensor, | mask: torch.BoolTensor = None | )
`tokens` is assumed to have shape `(batch_size, sequence_length, embedding_dim)`. `mask` is assumed to have shape `(batch_size, sequence_length)`, with all 1s preceding all 0s.
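As a concrete illustration of those shapes, here is a small sketch; the dimensions are arbitrary, and the last-token case assumes the mask is used to locate the final unmasked position in each sequence, as described above:

```python
import torch
from allennlp.modules.seq2vec_encoders import ClsPooler

batch_size, sequence_length, embedding_dim = 2, 5, 4
tokens = torch.randn(batch_size, sequence_length, embedding_dim)

# Default behavior: the pooled output is simply the first token's vector.
pooler = ClsPooler(embedding_dim=embedding_dim)
assert torch.equal(pooler(tokens), tokens[:, 0, :])

# cls_is_last_token=True: the mask (all 1s preceding all 0s) marks the real
# tokens, so the last unmasked position is selected for each sequence.
mask = torch.tensor([[1, 1, 1, 0, 0],
                     [1, 1, 1, 1, 1]], dtype=torch.bool)
last_pooler = ClsPooler(embedding_dim=embedding_dim, cls_is_last_token=True)
pooled = last_pooler(tokens, mask)  # selects tokens[0, 2] and tokens[1, 4]
assert pooled.shape == (batch_size, embedding_dim)
```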