stacked_alternating_lstm
allennlp.modules.stacked_alternating_lstm
A stacked LSTM whose layers alternate between processing the sequence forwards and backwards.
TensorPair
TensorPair = Tuple[torch.Tensor, torch.Tensor]
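TensorPair is simply an alias for a (hidden state, memory cell) pair. As a small sketch (batch size and hidden size below are arbitrary illustration values, not library defaults), an all-zero initial state matching the shape documented for forward() further down could be built as:

```python
import torch
from typing import Tuple

TensorPair = Tuple[torch.Tensor, torch.Tensor]

# All-zero (state, memory) pair for a hypothetical batch of 2 and hidden size 300;
# each tensor has shape (1, batch_size, hidden_size), as documented for forward().
initial_state: TensorPair = (torch.zeros(1, 2, 300), torch.zeros(1, 2, 300))
```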
StackedAlternatingLstm
class StackedAlternatingLstm(torch.nn.Module):
| def __init__(
| self,
| input_size: int,
| hidden_size: int,
| num_layers: int,
| recurrent_dropout_probability: float = 0.0,
| use_highway: bool = True,
| use_input_projection_bias: bool = True
| ) -> None
A stacked LSTM whose layers alternate between processing the sequence forwards and backwards. This implementation is based on the description in Deep Semantic Role Labeling: What Works and What's Next.
Parameters

- input_size : int
    The dimension of the inputs to the LSTM.
- hidden_size : int
    The dimension of the outputs of the LSTM.
- num_layers : int
    The number of stacked LSTMs to use.
- recurrent_dropout_probability : float, optional (default = 0.0)
    The dropout probability to be used in a dropout scheme as stated in
    A Theoretically Grounded Application of Dropout in Recurrent Neural Networks.
- use_highway : bool, optional (default = True)
    Whether or not to use highway connections between layers.
- use_input_projection_bias : bool, optional (default = True)
    Whether or not to use a bias on the input projection layer. This is mainly here
    for backwards compatibility reasons and will be removed (and set to False) in
    future releases.
Returns

- output_accumulator : PackedSequence
    The outputs of the interleaved LSTMs per timestep. A tensor of shape
    (batch_size, max_timesteps, hidden_size) where, for a given batch element, all
    outputs past the sequence length for that element are zero tensors.
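For illustration, a minimal construction sketch (the sizes below are arbitrary example values, not library defaults):

```python
from allennlp.modules.stacked_alternating_lstm import StackedAlternatingLstm

# Illustrative sizes only; adjust to your model.
lstm = StackedAlternatingLstm(
    input_size=100,                      # dimension of the inputs to the LSTM
    hidden_size=300,                     # dimension of the outputs of the LSTM
    num_layers=4,                        # layers alternate forward/backward directions
    recurrent_dropout_probability=0.1,   # dropout on the recurrent connections
    use_highway=True,                    # highway connections between layers
)
```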
forward
class StackedAlternatingLstm(torch.nn.Module):
| ...
| def forward(
| self,
| inputs: PackedSequence,
| initial_state: Optional[TensorPair] = None
| ) -> Tuple[Union[torch.Tensor, PackedSequence], TensorPair]
Parameters

- inputs : PackedSequence
    A batch first PackedSequence to run the stacked LSTM over.
- initial_state : Tuple[torch.Tensor, torch.Tensor], optional (default = None)
    A tuple (state, memory) representing the initial hidden state and memory of the
    LSTM. Each tensor has shape (1, batch_size, output_dimension).
Returns

- output_sequence : PackedSequence
    The encoded sequence of shape (batch_size, sequence_length, hidden_size).
- final_states : Tuple[torch.Tensor, torch.Tensor]
    The per-layer final (state, memory) states of the LSTM, each with shape
    (num_layers, batch_size, hidden_size).
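A hedged end-to-end sketch of calling forward (batch size, sequence lengths, and dimensions below are made up for the example; the packing utilities are standard PyTorch):

```python
import torch
from torch.nn.utils.rnn import pack_padded_sequence, pad_packed_sequence
from allennlp.modules.stacked_alternating_lstm import StackedAlternatingLstm

lstm = StackedAlternatingLstm(input_size=100, hidden_size=300, num_layers=4)

# Two sequences of lengths 5 and 3, padded to 5 timesteps, batch first.
inputs = torch.randn(2, 5, 100)
lengths = torch.tensor([5, 3])
packed = pack_padded_sequence(inputs, lengths, batch_first=True)

# initial_state is optional; omitting it starts from zero state and memory.
output_sequence, (final_state, final_memory) = lstm(packed)

# Unpack the PackedSequence back into a padded (batch_size, max_timesteps, hidden_size) tensor.
unpacked, _ = pad_packed_sequence(output_sequence, batch_first=True)
print(unpacked.shape)      # torch.Size([2, 5, 300])
print(final_state.shape)   # (num_layers, batch_size, hidden_size) -> torch.Size([4, 2, 300])
```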