additive_attention

allennlp.modules.attention.additive_attention

AdditiveAttention

@Attention.register("additive")
class AdditiveAttention(Attention):
 | def __init__(
 |     self,
 |     vector_dim: int,
 |     matrix_dim: int,
 |     normalize: bool = True
 | ) -> None

Computes attention between a vector and a matrix using an additive attention function. This function has two weight matrices, W and U, and a vector V. The similarity between the vector x and the matrix y is computed as V tanh(Wx + Uy), evaluated once per row of y.

This attention is often referred to as concat or additive attention. It was introduced in Neural Machine Translation by Jointly Learning to Align and Translate (Bahdanau et al., 2015).

Registered as an Attention with name "additive".
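
The similarity V tanh(Wx + Uy) described above can be sketched in plain Python for a single, unbatched example. This is a dependency-free illustration, not the library's implementation: the helper names `matvec` and `additive_similarity`, the weight shapes, and the toy values are all assumptions chosen for readability.

```python
import math


def matvec(M, v):
    """Multiply a matrix M (a list of rows) by a vector v."""
    return [sum(m * a for m, a in zip(row, v)) for row in M]


def additive_similarity(x, y, W, U, V):
    """Return one score per row y_i of y, computed as V . tanh(W x + U y_i).

    Hypothetical helper for illustration only; shapes are assumed to be
    W: hidden x vector_dim, U: hidden x matrix_dim, V: hidden.
    """
    wx = matvec(W, x)  # the projection of x is shared by every row of y
    scores = []
    for y_i in y:
        uy = matvec(U, y_i)
        hidden = [math.tanh(a + b) for a, b in zip(wx, uy)]
        scores.append(sum(v * h for v, h in zip(V, hidden)))
    return scores


# Toy inputs (assumed values): vector_dim = 2, matrix_dim = 3, three rows in y.
x = [1.0, 0.0]
y = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 0.0]]
W = [[0.5, 0.0], [0.0, 0.5]]
U = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
V = [1.0, -1.0]

scores = additive_similarity(x, y, W, U, V)
```

Each entry of `scores` is the unnormalized similarity between x and one row of y, so the output has as many entries as y has rows.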

Parameters

vector_dim : int
The dimension of the vector, x, described above. This is x.size()[-1] - the length of the vector that will go into the similarity computation. We need this so we can build the weight matrix correctly.
matrix_dim : int
The dimension of the matrix, y, described above. This is y.size()[-1] - the length of each row vector of the matrix that will go into the similarity computation. We need this so we can build the weight matrix correctly.
normalize : bool, optional (default = True)
If true, we normalize the computed similarities with a softmax, to return a probability distribution over the rows of the matrix. If false, the raw similarity scores are returned.
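
The effect of normalize can be illustrated with a small standalone sketch. The function `finalize_scores` and the sample scores are hypothetical and only mirror the behaviour described for the parameter; the library itself operates on batched tensors and may also apply a mask.

```python
import math


def finalize_scores(scores, normalize=True):
    """Hypothetical helper: softmax the scores if normalize is True,
    otherwise return the raw similarity scores unchanged."""
    if not normalize:
        return list(scores)
    shift = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


raw = [2.0, 1.0, 0.0]  # assumed similarity scores for three rows
weights = finalize_scores(raw, normalize=True)     # sums to 1
unchanged = finalize_scores(raw, normalize=False)  # same values as raw
```

With normalize enabled the result can be used directly as attention weights; with it disabled the caller receives scores that can be combined or normalized elsewhere.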

reset_parameters

class AdditiveAttention(Attention):
 | ...
 | def reset_parameters(self)