additive_attention
allennlp.modules.attention.additive_attention
AdditiveAttention¶
@Attention.register("additive")
class AdditiveAttention(Attention):
| def __init__(
| self,
| vector_dim: int,
| matrix_dim: int,
| normalize: bool = True
| ) -> None
Computes attention between a vector and a matrix using an additive attention function. This function has two matrices W, U and a vector V. The similarity between the vector x and the matrix y is computed as V tanh(Wx + Uy).
This attention is often referred to as concat or additive attention. It was introduced in Neural Machine Translation by Jointly Learning to Align and Translate (Bahdanau et al., 2015).
Registered as an Attention with name "additive".
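As a rough illustration of the formula above, the following sketch computes V tanh(Wx + Uy) in plain PyTorch. The hidden size and the random parameters here are assumptions of the sketch, not the module's actual internals.

```python
import torch

torch.manual_seed(0)

batch_size, num_rows = 2, 5
vector_dim, matrix_dim = 4, 6
hidden_dim = 4  # free choice of this sketch, not necessarily what the module uses

x = torch.randn(batch_size, vector_dim)             # the query vector
y = torch.randn(batch_size, num_rows, matrix_dim)   # the matrix being attended over

W = torch.randn(vector_dim, hidden_dim)
U = torch.randn(matrix_dim, hidden_dim)
V = torch.randn(hidden_dim)

# V tanh(Wx + Uy): project both inputs into a shared space, add them
# (broadcasting x over the rows of y), squash with tanh, and reduce each
# row to a single similarity score with V.
hidden = torch.tanh((x @ W).unsqueeze(1) + y @ U)   # (batch_size, num_rows, hidden_dim)
scores = hidden @ V                                  # (batch_size, num_rows)

# With normalize=True the module would additionally softmax over the rows:
weights = torch.softmax(scores, dim=-1)
```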
Parameters¶
- vector_dim : int
    The dimension of the vector, x, described above. This is x.size()[-1] - the length of the vector that will go into the similarity computation. We need this so we can build the weight matrix correctly.
- matrix_dim : int
    The dimension of the matrix, y, described above. This is y.size()[-1] - the length of the vector that will go into the similarity computation. We need this so we can build the weight matrix correctly.
- normalize : bool, optional (default = True)
    If true, we normalize the computed similarities with a softmax, to return a probability distribution for your attention. If false, this is just computing a similarity score.
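A minimal usage sketch tying these parameters together; the shapes below are illustrative, and it assumes the class is importable from allennlp.modules.attention and called with (vector, matrix).

```python
import torch
from allennlp.modules.attention import AdditiveAttention

# vector_dim and matrix_dim are the last dimensions of the two inputs.
attention = AdditiveAttention(vector_dim=4, matrix_dim=6, normalize=True)

vector = torch.randn(2, 4)      # (batch_size, vector_dim)
matrix = torch.randn(2, 5, 6)   # (batch_size, num_rows, matrix_dim)

weights = attention(vector, matrix)
print(weights.shape)            # torch.Size([2, 5]); rows sum to 1 when normalize=True
```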
reset_parameters¶
class AdditiveAttention(Attention):
| ...
| def reset_parameters(self)