additive_attention

allennlp.modules.attention.additive_attention

AdditiveAttention

@Attention.register("additive")
class AdditiveAttention(Attention):
 | def __init__(
 |     self,
 |     vector_dim: int,
 |     matrix_dim: int,
 |     normalize: bool = True
 | ) -> None

Computes attention between a vector and a matrix using an additive attention function. This function has two weight matrices W, U and a vector V. The similarity between the vector x and each row y of the matrix is computed as V^T tanh(Wx + Uy).

This attention is often referred to as concat or additive attention. It was introduced in Neural Machine Translation by Jointly Learning to Align and Translate (Bahdanau et al., 2015).

Registered as an Attention with name "additive".
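For concreteness, here is a small self-contained PyTorch sketch of the scoring function described above, written directly from the formula rather than from the module's internal code; all tensor names and sizes are illustrative assumptions.

import torch

# Illustrative sizes: one query vector per batch element, attending over
# num_rows rows of the matrix.
batch_size, vector_dim, matrix_dim, num_rows = 2, 4, 6, 5
hidden_dim = vector_dim  # assumed size of the tanh layer

x = torch.randn(batch_size, vector_dim)            # the vector x
y = torch.randn(batch_size, num_rows, matrix_dim)  # the matrix y

# The parameters of the additive attention function: W, U, and V.
W = torch.randn(vector_dim, hidden_dim)
U = torch.randn(matrix_dim, hidden_dim)
V = torch.randn(hidden_dim)

# Similarity for every row y_i of the matrix: V^T tanh(W x + U y_i).
# (batch, 1, hidden) + (batch, num_rows, hidden) -> (batch, num_rows, hidden)
hidden = torch.tanh(x.matmul(W).unsqueeze(1) + y.matmul(U))
scores = hidden.matmul(V)  # (batch, num_rows)

# With normalize=True the module additionally applies a softmax over the rows.
weights = torch.nn.functional.softmax(scores, dim=-1)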

Parameters

  • vector_dim : int
    The dimension of the vector, x, described above. This is x.size()[-1], the length of the vector that goes into the similarity computation. We need this so we can build the weight matrix correctly.
  • matrix_dim : int
    The dimension of the rows of the matrix, y, described above. This is y.size()[-1], the length of each row vector that goes into the similarity computation. We need this so we can build the weight matrix correctly.
  • normalize : bool, optional (default = True)
    If true, we normalize the computed similarities with a softmax, to return a probability distribution for your attention. If false, this just computes an unnormalized similarity score for each row.
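A minimal usage sketch follows; the dimensions and tensor shapes are illustrative assumptions (a (batch, vector_dim) query vector and a (batch, num_rows, matrix_dim) matrix to attend over).

import torch
from allennlp.modules.attention.additive_attention import AdditiveAttention

attention = AdditiveAttention(vector_dim=4, matrix_dim=6)

vector = torch.randn(2, 4)     # x: one query vector per batch element
matrix = torch.randn(2, 5, 6)  # y: five rows per batch element

# With the default normalize=True, the result is a softmax-normalized
# attention distribution over the rows of the matrix, of shape (2, 5).
weights = attention(vector, matrix)
print(weights.shape, weights.sum(dim=-1))  # each row sums to ~1.0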

reset_parameters

class AdditiveAttention(Attention):
 | ...
 | def reset_parameters(self)
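
Re-initializes the module's learned parameters. The documentation above does not state the initialization scheme, so the following is only a rough, hypothetical sketch of what such a reset could look like, assuming Xavier-uniform initialization of the W, U, and V parameters; the attribute names (_w_matrix, _u_matrix, _v_vector) are assumptions, not necessarily the module's internals.

import torch
from torch.nn import Parameter

class AdditiveAttentionSketch(torch.nn.Module):
    # Illustrative stand-in, not the AllenNLP implementation.

    def __init__(self, vector_dim: int, matrix_dim: int) -> None:
        super().__init__()
        # Hypothetical parameter names for W, U, and V.
        self._w_matrix = Parameter(torch.empty(vector_dim, vector_dim))
        self._u_matrix = Parameter(torch.empty(matrix_dim, vector_dim))
        self._v_vector = Parameter(torch.empty(vector_dim, 1))
        self.reset_parameters()

    def reset_parameters(self) -> None:
        # One common choice: Xavier-uniform initialization for all three parameters.
        torch.nn.init.xavier_uniform_(self._w_matrix)
        torch.nn.init.xavier_uniform_(self._u_matrix)
        torch.nn.init.xavier_uniform_(self._v_vector)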