@Attention.register("additive") class AdditiveAttention(Attention): | def __init__( | self, | vector_dim: int, | matrix_dim: int, | normalize: bool = True | ) -> None
Computes attention between a vector and a matrix using an additive attention function. This function has two matrices, W and U, and a vector V. The similarity between the vector x and the matrix y is computed as V tanh(Wx + Uy).
This attention is often referred to as concat or additive attention. It was introduced in Neural Machine Translation by Jointly Learning to Align and Translate (Bahdanau et al., 2015).
Registered as an Attention with name "additive".
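To make the formula V tanh(Wx + Uy) concrete, here is a minimal plain-Python sketch that scores each row of a matrix against a vector. It is an illustration only, not the library's implementation: the function name additive_scores, the hidden size, the weight shapes, and all numeric values are assumptions chosen for the example.

```python
import math

def additive_scores(x, Y, W, U, V):
    """Score each row y of Y against x as V . tanh(W x + U y).

    Hypothetical helper for illustration; weights are plain nested lists.
    """
    def matvec(M, v):
        # Multiply matrix M (list of rows) by vector v.
        return [sum(m * a for m, a in zip(row, v)) for row in M]

    Wx = matvec(W, x)  # depends only on x, so compute it once
    scores = []
    for y in Y:
        Uy = matvec(U, y)
        hidden = [math.tanh(a + b) for a, b in zip(Wx, Uy)]
        scores.append(sum(v * h for v, h in zip(V, hidden)))
    return scores

# Toy setup (made-up values): vector_dim = 2, matrix_dim = 3, hidden size 2.
x = [1.0, 0.0]
Y = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]]
W = [[1.0, 0.0], [0.0, 1.0]]
U = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]]
V = [1.0, 1.0]
scores = additive_scores(x, Y, W, U, V)  # one unnormalized score per row of Y
```

The sketch returns one raw score per row of the matrix; whether those scores are turned into a distribution is a separate step.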
- vector_dim :
The dimension of the vector, x, described above. This is x.size()[-1], the length of the vector that will go into the similarity computation. We need this so we can build the weight matrix correctly.
- matrix_dim :
The dimension of the matrix, y, described above. This is y.size()[-1], the length of each vector in the matrix that will go into the similarity computation. We need this so we can build the weight matrix correctly.
- normalize :
bool, optional (default = True)
If true, we normalize the computed similarities with a softmax, to return a probability distribution for your attention. If false, this is just computing a similarity score.
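The effect of the normalize flag can be sketched in plain Python as follows. The helper name attend and the example scores are hypothetical, and this is a simplified stand-in rather than the library's code: with normalize=True the scores pass through a softmax, otherwise they are returned unchanged.

```python
import math

def attend(scores, normalize=True):
    """Return softmax weights if normalize is true, else the raw scores."""
    if not normalize:
        return list(scores)
    # Subtract the max before exponentiating for numerical stability.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

raw = attend([1.0, 2.0, 3.0], normalize=False)     # similarity scores as-is
weights = attend([1.0, 2.0, 3.0], normalize=True)  # non-negative, sums to 1
```

Raw scores are useful when a later component applies its own normalization; the softmax output can be used directly as attention weights.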
class AdditiveAttention(Attention): | ... | def reset_parameters(self)