class SinusoidalPositionalEncoding(torch.nn.Module,  FromParams):
 | def __init__(
 |     self,
 |     min_timescale: float = 1.0,
 |     max_timescale: float = 1.0e4
 | )

Implements the frequency-based positional encoding described in "Attention Is All You Need".

Adds a sinusoid of a different frequency and phase to each dimension of the input Tensor. This allows the attention heads to use both absolute and relative positions.

The encoding uses hidden_dim / 2 timescales, all within the range (min_timescale, max_timescale). For each timescale, the two sinusoidal signals sin(timestep / timescale) and cos(timestep / timescale) are generated, and the resulting signals are concatenated along the hidden_dim dimension.
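The signal construction described above can be sketched in plain Python. This is an illustrative sketch, not the library's implementation: the helper names sinusoidal_encoding and add_positional_encoding are hypothetical, the geometric spacing of timescales between the two bounds is an assumption (the text only states the range), and so is the zero padding used when hidden_dim is odd.

```python
import math


def sinusoidal_encoding(timesteps, hidden_dim, min_timescale=1.0, max_timescale=1.0e4):
    """Return a timesteps x hidden_dim table of sinusoidal position signals."""
    num_timescales = hidden_dim // 2
    # Assumption: timescales are spaced geometrically between the two bounds.
    if num_timescales > 1:
        log_increment = math.log(max_timescale / min_timescale) / (num_timescales - 1)
    else:
        log_increment = 0.0
    timescales = [min_timescale * math.exp(i * log_increment) for i in range(num_timescales)]

    table = []
    for t in range(timesteps):
        # One sine and one cosine signal per timescale, concatenated along hidden_dim.
        sines = [math.sin(t / scale) for scale in timescales]
        cosines = [math.cos(t / scale) for scale in timescales]
        row = sines + cosines
        if hidden_dim % 2 == 1:
            row.append(0.0)  # Assumption: pad an odd hidden_dim with a zero column.
        table.append(row)
    return table


def add_positional_encoding(sequence, min_timescale=1.0, max_timescale=1.0e4):
    """Add the signals elementwise to one (timesteps, hidden_dim) sequence."""
    table = sinusoidal_encoding(len(sequence), len(sequence[0]), min_timescale, max_timescale)
    return [[x + p for x, p in zip(row, pos)] for row, pos in zip(sequence, table)]


# Encode an all-zero sequence with 3 timesteps and hidden_dim = 4.
encoded = add_positional_encoding([[0.0] * 4 for _ in range(3)])
```

Because the input here is all zeros, encoded holds the raw signals: in each row the first two columns are the sines and the last two are the cosines. For a batched input of shape (batch_size, timesteps, hidden_dim), the same table would be added to every sequence in the batch.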


Parameters

  • tensor : torch.Tensor
    A Tensor with shape (batch_size, timesteps, hidden_dim), passed to forward as input_tensor.
  • min_timescale : float, optional (default = 1.0)
    The smallest timescale to use.
  • max_timescale : float, optional (default = 1.0e4)
    The largest timescale to use.


Returns

  • torch.Tensor
    The input tensor augmented with the sinusoidal frequencies.


class SinusoidalPositionalEncoding(torch.nn.Module,  FromParams):
 | ...
 | def forward(self, input_tensor: torch.Tensor)

Adds a positional encoding to input_tensor.