additive_attention
[ allennlp.modules.attention.additive_attention ]
AdditiveAttention#
@Attention.register("additive")
class AdditiveAttention(Attention):
| def __init__(
| self,
| vector_dim: int,
| matrix_dim: int,
| normalize: bool = True
| ) -> None
Computes attention between a vector and a matrix using an additive attention function. This
function has two matrices, `W` and `U`, and a vector `V`. The similarity between the vector
`x` and the matrix `y` is computed as `V tanh(Wx + Uy)`.
This attention is often referred to as concat or additive attention. It was introduced by Bahdanau et al. in https://arxiv.org/abs/1409.0473.
Registered as an Attention with name "additive".
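As a quick illustration, here is a hedged usage sketch. It assumes the standard AllenNLP `Attention` calling convention, where the query `vector` has shape `(batch_size, vector_dim)`, the `matrix` has shape `(batch_size, num_rows, matrix_dim)`, and the returned attention weights have shape `(batch_size, num_rows)`; the concrete dimensions below are arbitrary examples.

```python
import torch
from allennlp.modules.attention import AdditiveAttention

# A batch of 2 query vectors, each attending over 5 matrix rows.
attention = AdditiveAttention(vector_dim=3, matrix_dim=4)
vector = torch.randn(2, 3)     # (batch_size, vector_dim)
matrix = torch.randn(2, 5, 4)  # (batch_size, num_rows, matrix_dim)

weights = attention(vector, matrix)  # (batch_size, num_rows)
print(weights.shape)        # torch.Size([2, 5])
print(weights.sum(dim=-1))  # each ~1.0, since normalize=True applies a softmax
```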
Parameters
- vector_dim : `int`
  The dimension of the vector, `x`, described above. This is `x.size()[-1]` - the length of the vector that will go into the similarity computation. We need this so we can build the weight matrix correctly.
- matrix_dim : `int`
  The dimension of the matrix, `y`, described above. This is `y.size()[-1]` - the length of the vector that will go into the similarity computation. We need this so we can build the weight matrix correctly.
- normalize : `bool`, optional (default = `True`)
  If true, we normalize the computed similarities with a softmax, to return a probability distribution for your attention. If false, this is just computing a similarity score.
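To make the roles of these parameters concrete, here is a minimal from-scratch sketch of the similarity function `V tanh(Wx + Uy)` with the optional softmax. This is not the library's internal code: the parameter names `w`, `u`, `v` and the shared `hidden_dim` shape are assumptions chosen so the dimensions line up.

```python
import torch

def additive_scores(vector: torch.Tensor,  # (batch, vector_dim)
                    matrix: torch.Tensor,  # (batch, num_rows, matrix_dim)
                    w: torch.Tensor,       # (vector_dim, hidden_dim) -- assumed shape
                    u: torch.Tensor,       # (matrix_dim, hidden_dim) -- assumed shape
                    v: torch.Tensor,       # (hidden_dim,)            -- assumed shape
                    normalize: bool = True) -> torch.Tensor:
    # Wx: (batch, 1, hidden_dim) broadcasts against Uy: (batch, num_rows, hidden_dim).
    hidden = torch.tanh(vector.matmul(w).unsqueeze(1) + matrix.matmul(u))
    scores = hidden.matmul(v)  # (batch, num_rows)
    # With normalize=True, return a probability distribution over the rows;
    # otherwise return the raw similarity scores.
    return torch.softmax(scores, dim=-1) if normalize else scores
```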
reset_parameters#
class AdditiveAttention(Attention):
| ...
| def reset_parameters(self)
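The signature above does not show what gets reset. As a hedged sketch, a reset for this module plausibly re-initializes the three learned parameters `W`, `U`, and `V`; the attribute names and the Xavier-uniform initializer below are assumptions, not confirmed by this page.

```python
# Sketch of a plausible reset, assuming the module stores W, U, V as
# self._w_matrix, self._u_matrix, self._v_vector (assumed names) and
# uses Xavier-uniform initialization (also an assumption).
def reset_parameters(self):
    torch.nn.init.xavier_uniform_(self._w_matrix)
    torch.nn.init.xavier_uniform_(self._u_matrix)
    torch.nn.init.xavier_uniform_(self._v_vector)
```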