AllenNLP v2.10.1

scaled_dot_product_attention

allennlp.modules.attention.scaled_dot_product_attention

ScaledDotProductAttention¶

@Attention.register("scaled_dot_product")
class ScaledDotProductAttention(DotProductAttention):
 | def __init__(
 |     self,
 |     scaling_factor: Optional[int] = None,
 |     normalize: bool = True
 | ) -> None

Computes attention between two tensors using scaled dot product.

Reference: [Attention Is All You Need (Vaswani et al., 2017)](https://api.semanticscholar.org/CorpusID:13756489)

Registered as an Attention with name "scaled_dot_product".

Parameters¶

scaling_factor : int, optional (default = None)
The similarity score is scaled down by the scaling_factor.
normalize : bool, optional (default = True)
If true, we normalize the computed similarities with a softmax, to return a probability distribution for your attention. If false, this is just computing a similarity score.
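
The effect of the two parameters can be illustrated with a minimal, standard-library sketch. This is not the AllenNLP implementation: the function name `scaled_dot_product_attention`, the unbatched list inputs, and the sample values are hypothetical. It assumes, following the referenced paper, that scores are divided by the square root of `scaling_factor`, and that the vector dimension is used when `scaling_factor` is None.

```python
import math
from typing import List, Optional


def scaled_dot_product_attention(
    vector: List[float],
    matrix: List[List[float]],
    scaling_factor: Optional[int] = None,
    normalize: bool = True,
) -> List[float]:
    """Attend from one query vector over the rows of a matrix (unbatched sketch)."""
    # Assumption: with no explicit scaling_factor, fall back to the vector dimension.
    scale = math.sqrt(scaling_factor or len(vector))
    # One similarity score per matrix row: dot(row, vector) / sqrt(scaling_factor).
    scores = [sum(r * v for r, v in zip(row, vector)) / scale for row in matrix]
    if not normalize:
        # normalize=False: return the raw scaled similarity scores.
        return scores
    # normalize=True: softmax over the scores, shifted by the max for stability.
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical inputs: a 4-dimensional query and two rows to attend over.
query = [1.0, 0.0, 1.0, 0.0]
keys = [[1.0, 0.0, 1.0, 0.0], [0.0, 1.0, 0.0, 1.0]]

raw = scaled_dot_product_attention(query, keys, scaling_factor=4, normalize=False)
weights = scaled_dot_product_attention(query, keys, scaling_factor=4)
```

With `normalize=False` the result is one scaled similarity score per row. With `normalize=True` the same scores pass through a softmax, so the weights are non-negative and sum to 1.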