allennlp.modules.stacked_alternating_lstm
A stacked LSTM in which the layers alternate between running forwards and backwards over the sequence.
class allennlp.modules.stacked_alternating_lstm.StackedAlternatingLstm(input_size: int, hidden_size: int, num_layers: int, recurrent_dropout_probability: float = 0.0, use_highway: bool = True, use_input_projection_bias: bool = True)

Bases: torch.nn.modules.module.Module
A stacked LSTM in which the layers alternate between running forwards and backwards over the sequence. This implementation is based on the description in Deep Semantic Role Labeling: What Works and What's Next.
- Parameters
- input_size : int, required
The dimension of the inputs to the LSTM.
- hidden_size : int, required
The dimension of the outputs of the LSTM.
- num_layers : int, required
The number of stacked LSTMs to use.
- recurrent_dropout_probability : float, optional (default = 0.0)
The dropout probability to be used in a dropout scheme as stated in A Theoretically Grounded Application of Dropout in Recurrent Neural Networks.
- use_input_projection_bias : bool, optional (default = True)
Whether or not to use a bias on the input projection layer. This is mainly here for backwards compatibility reasons and will be removed (and set to False) in future releases.
- Returns
- output_accumulator : PackedSequence
The outputs of the interleaved LSTMs per timestep. A tensor of shape (batch_size, max_timesteps, hidden_size) where for a given batch element, all outputs past the sequence length for that batch are zero tensors.
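A minimal construction sketch; the dimensions and dropout value below are illustrative choices, not defaults from the library:

```python
from allennlp.modules.stacked_alternating_lstm import StackedAlternatingLstm

# Illustrative sizes only; choose dimensions to match your model.
lstm = StackedAlternatingLstm(
    input_size=100,
    hidden_size=200,
    num_layers=4,
    recurrent_dropout_probability=0.1,
)
```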
forward(self, inputs: torch.nn.utils.rnn.PackedSequence, initial_state: Union[Tuple[torch.Tensor, torch.Tensor], NoneType] = None) → Tuple[Union[torch.Tensor, torch.nn.utils.rnn.PackedSequence], Tuple[torch.Tensor, torch.Tensor]]

- Parameters
- inputs : PackedSequence, required
A batch first PackedSequence to run the stacked LSTM over.
- initial_state : Tuple[torch.Tensor, torch.Tensor], optional (default = None)
A tuple (state, memory) representing the initial hidden state and memory of the LSTM. Each tensor has shape (1, batch_size, output_dimension).
- Returns
- output_sequence : PackedSequence
The encoded sequence of shape (batch_size, sequence_length, hidden_size).
- final_states : Tuple[torch.Tensor, torch.Tensor]
The per-layer final (state, memory) states of the LSTM, each with shape (num_layers, batch_size, hidden_size).
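A minimal sketch of running the module over a batch of padded sequences. It assumes the inputs are batch first and already sorted by decreasing length, as pack_padded_sequence expects by default; the batch size, lengths, and dimensions are illustrative:

```python
import torch
from torch.nn.utils.rnn import pack_padded_sequence, pad_packed_sequence

from allennlp.modules.stacked_alternating_lstm import StackedAlternatingLstm

lstm = StackedAlternatingLstm(input_size=100, hidden_size=200, num_layers=4)

# Batch of 3 padded sequences, max length 5, 100-dimensional inputs (illustrative).
inputs = torch.randn(3, 5, 100)
lengths = [5, 4, 2]  # sorted in decreasing order
packed = pack_padded_sequence(inputs, lengths, batch_first=True)

output_sequence, (state, memory) = lstm(packed)

# Unpack to a padded tensor of shape (3, 5, 200); state and memory each have
# shape (num_layers, 3, 200).
padded_output, _ = pad_packed_sequence(output_sequence, batch_first=True)
```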