stacked_alternating_lstm
allennlp.modules.stacked_alternating_lstm
A stacked LSTM whose layers alternate between processing the sequence forwards and backwards.
TensorPair
TensorPair = Tuple[torch.Tensor, torch.Tensor]
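TensorPair is simply an alias for a (hidden state, memory cell) pair. As a small sketch (batch size and hidden size below are arbitrary illustration values, not library defaults), an all-zero initial state matching the shape documented for forward() further down could be built as:

```python
import torch
from typing import Tuple

TensorPair = Tuple[torch.Tensor, torch.Tensor]

# All-zero (state, memory) pair for a hypothetical batch of 2 and hidden size 300;
# each tensor has shape (1, batch_size, hidden_size), as documented for forward().
initial_state: TensorPair = (torch.zeros(1, 2, 300), torch.zeros(1, 2, 300))
```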
StackedAlternatingLstm
class StackedAlternatingLstm(torch.nn.Module):
| def __init__(
| self,
| input_size: int,
| hidden_size: int,
| num_layers: int,
| recurrent_dropout_probability: float = 0.0,
| use_highway: bool = True,
| use_input_projection_bias: bool = True
| ) -> None
A stacked LSTM whose layers alternate between processing the sequence forwards and backwards. This implementation is based on the description in Deep Semantic Role Labeling: What Works and What's Next.
Parameters

- input_size : int
    The dimension of the inputs to the LSTM.
- hidden_size : int
    The dimension of the outputs of the LSTM.
- num_layers : int
    The number of stacked LSTMs to use.
- recurrent_dropout_probability : float, optional (default = 0.0)
    The dropout probability to be used in a dropout scheme as stated in
    A Theoretically Grounded Application of Dropout in Recurrent Neural Networks.
- use_highway : bool, optional (default = True)
    Whether or not to use highway connections between layers.
- use_input_projection_bias : bool, optional (default = True)
    Whether or not to use a bias on the input projection layer. This is mainly here
    for backwards compatibility reasons and will be removed (and set to False) in
    future releases.
Returns

- output_accumulator : PackedSequence
    The outputs of the interleaved LSTMs per timestep. A tensor of shape
    (batch_size, max_timesteps, hidden_size) where, for a given batch element, all
    outputs past the sequence length for that element are zero tensors.
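For illustration, a minimal construction sketch (the sizes below are arbitrary example values, not library defaults):

```python
from allennlp.modules.stacked_alternating_lstm import StackedAlternatingLstm

# Illustrative sizes only; adjust to your model.
lstm = StackedAlternatingLstm(
    input_size=100,                      # dimension of the inputs to the LSTM
    hidden_size=300,                     # dimension of the outputs of the LSTM
    num_layers=4,                        # layers alternate forward/backward directions
    recurrent_dropout_probability=0.1,   # dropout on the recurrent connections
    use_highway=True,                    # highway connections between layers
)
```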
forward
class StackedAlternatingLstm(torch.nn.Module):
| ...
| def forward(
| self,
| inputs: PackedSequence,
| initial_state: Optional[TensorPair] = None
| ) -> Tuple[Union[torch.Tensor, PackedSequence], TensorPair]
Parameters

- inputs : PackedSequence
    A batch first PackedSequence to run the stacked LSTM over.
- initial_state : Tuple[torch.Tensor, torch.Tensor], optional (default = None)
    A tuple (state, memory) representing the initial hidden state and memory of the
    LSTM. Each tensor has shape (1, batch_size, output_dimension).
Returns

- output_sequence : PackedSequence
    The encoded sequence of shape (batch_size, sequence_length, hidden_size).
- final_states : Tuple[torch.Tensor, torch.Tensor]
    The per-layer final (state, memory) states of the LSTM, each with shape
    (num_layers, batch_size, hidden_size).
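A hedged end-to-end sketch of calling forward (batch size, sequence lengths, and dimensions below are made up for the example; the packing utilities are standard PyTorch):

```python
import torch
from torch.nn.utils.rnn import pack_padded_sequence, pad_packed_sequence
from allennlp.modules.stacked_alternating_lstm import StackedAlternatingLstm

lstm = StackedAlternatingLstm(input_size=100, hidden_size=300, num_layers=4)

# Two sequences of lengths 5 and 3, padded to 5 timesteps, batch first.
inputs = torch.randn(2, 5, 100)
lengths = torch.tensor([5, 3])
packed = pack_padded_sequence(inputs, lengths, batch_first=True)

# initial_state is optional; omitting it starts from zero state and memory.
output_sequence, (final_state, final_memory) = lstm(packed)

# Unpack the PackedSequence back into a padded (batch_size, max_timesteps, hidden_size) tensor.
unpacked, _ = pad_packed_sequence(output_sequence, batch_first=True)
print(unpacked.shape)      # torch.Size([2, 5, 300])
print(final_state.shape)   # (num_layers, batch_size, hidden_size) -> torch.Size([4, 2, 300])
```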