An attention module that computes the similarity between an input vector and the rows of a matrix.
class Attention(torch.nn.Module, Registrable):
    def __init__(self, normalize: bool = True) -> None
Attention takes two inputs: a (batched) vector and a matrix, plus an optional mask on the
rows of the matrix. We compute the similarity between the vector and each row in the matrix,
and then (optionally) perform a softmax over rows using those computed similarities.
- vector: shape (batch_size, embedding_dim)
- matrix: shape (batch_size, num_rows, embedding_dim)
- matrix_mask: shape (batch_size, num_rows), specifying which rows are just padding.
- attention (the output): shape (batch_size, num_rows)
- normalize : bool, optional (default = True)
  If true, the computed similarities are normalized with a softmax, so the output is a probability distribution over the rows of the matrix. If false, the raw similarity scores are returned.
class Attention(torch.nn.Module, Registrable):
    ...
    def forward(
        self,
        vector: torch.Tensor,
        matrix: torch.Tensor,
        matrix_mask: torch.BoolTensor = None
    ) -> torch.Tensor
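The computation described above can be illustrated with a small standalone sketch. This is not the library's implementation: the function name `dot_product_attention` is hypothetical, and the dot product is only an assumed choice of similarity function, since the text does not say which similarity is used. The sketch also assumes every batch element has at least one unmasked row; a fully masked row would produce NaN in this simple version.

```python
import torch


def dot_product_attention(vector, matrix, matrix_mask=None, normalize=True):
    """Hypothetical helper: dot-product similarity plus an optional masked softmax."""
    # vector: (batch_size, embedding_dim)
    # matrix: (batch_size, num_rows, embedding_dim)
    # Dot product of the vector with each row -> (batch_size, num_rows)
    similarities = torch.bmm(matrix, vector.unsqueeze(-1)).squeeze(-1)
    if not normalize:
        # Return the raw similarity scores, no softmax.
        return similarities
    if matrix_mask is not None:
        # Padding rows (mask == False) are set to -inf so softmax gives them zero weight.
        similarities = similarities.masked_fill(~matrix_mask, float("-inf"))
    return torch.softmax(similarities, dim=-1)


vector = torch.randn(2, 4)     # batch_size=2, embedding_dim=4
matrix = torch.randn(2, 3, 4)  # num_rows=3
matrix_mask = torch.tensor([[True, True, False],
                            [True, True, True]])

attention = dot_product_attention(vector, matrix, matrix_mask)
scores = dot_product_attention(vector, matrix, matrix_mask, normalize=False)
```

With `normalize=True`, each row of `attention` sums to 1 and the padded position receives zero weight; with `normalize=False`, the result has the same shape but holds unnormalized scores.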