lstm_cell_with_projection
allennlp.modules.lstm_cell_with_projection
An LSTM with Recurrent Dropout, a hidden_state which is projected and clipping on both the hidden state and the memory state of the LSTM.
LstmCellWithProjection#
class LstmCellWithProjection(torch.nn.Module):
| def __init__(
| self,
| input_size: int,
| hidden_size: int,
| cell_size: int,
| go_forward: bool = True,
| recurrent_dropout_probability: float = 0.0,
| memory_cell_clip_value: Optional[float] = None,
| state_projection_clip_value: Optional[float] = None
| ) -> None
An LSTM with Recurrent Dropout and a projected and clipped hidden state and memory. Note: this implementation is slower than the native Pytorch LSTM because it cannot make use of CUDNN optimizations for stacked RNNs due to and variational dropout and the custom nature of the cell state.
Parameters
- input_size :
int
The dimension of the inputs to the LSTM. - hidden_size :
int
The dimension of the outputs of the LSTM. - cell_size :
int
The dimension of the memory cell used for the LSTM. - go_forward :
bool
, optional (default =True
)
The direction in which the LSTM is applied to the sequence. Forwards by default, or backwards if False. - recurrent_dropout_probability :
float
, optional (default =0.0
)
The dropout probability to be used in a dropout scheme as stated in [A Theoretically Grounded Application of Dropout in Recurrent Neural Networks] 0. Implementation wise, this simply applies a fixed dropout mask per sequence to the recurrent connection of the LSTM. - state_projection_clip_value :
float
, optional (default =None
)
The magnitude with which to clip the hidden_state after projecting it. - memory_cell_clip_value :
float
, optional (default =None
)
The magnitude with which to clip the memory cell.
Returns
- output_accumulator :
torch.FloatTensor
The outputs of the LSTM for each timestep. A tensor of shape (batch_size, max_timesteps, hidden_size) where for a given batch element, all outputs past the sequence length for that batch are zero tensors. - final_state :
Tuple[torch.FloatTensor, torch.FloatTensor]
The final (state, memory) states of the LSTM, with shape (1, batch_size, hidden_size) and (1, batch_size, cell_size) respectively. The first dimension is 1 in order to match the Pytorch API for returning stacked LSTM states.
reset_parameters#
class LstmCellWithProjection(torch.nn.Module):
| ...
| def reset_parameters(self)
Use sensible default initializations for parameters.
forward#
class LstmCellWithProjection(torch.nn.Module):
| ...
| def forward(
| self,
| inputs: torch.FloatTensor,
| batch_lengths: List[int],
| initial_state: Optional[Tuple[torch.Tensor, torch.Tensor]] = None
| )
Parameters
- inputs :
torch.FloatTensor
A tensor of shape (batch_size, num_timesteps, input_size) to apply the LSTM over. - batch_lengths :
List[int]
A list of length batch_size containing the lengths of the sequences in batch. - initial_state :
Tuple[torch.Tensor, torch.Tensor]
, optional (default =None
)
A tuple (state, memory) representing the initial hidden state and memory of the LSTM. Thestate
has shape (1, batch_size, hidden_size) and thememory
has shape (1, batch_size, cell_size).
Returns
- output_accumulator :
torch.FloatTensor
The outputs of the LSTM for each timestep. A tensor of shape (batch_size, max_timesteps, hidden_size) where for a given batch element, all outputs past the sequence length for that batch are zero tensors. - final_state :
Tuple[torch.FloatTensor, torch.FloatTensor]
A tuple (state, memory) representing the initial hidden state and memory of the LSTM. Thestate
has shape (1, batch_size, hidden_size) and thememory
has shape (1, batch_size, cell_size).