gated_cnn_encoder
allennlp.modules.seq2seq_encoders.gated_cnn_encoder
ResidualBlock#
class ResidualBlock(torch.nn.Module):
| def __init__(
| self,
| input_dim: int,
| layers: Sequence[Sequence[int]],
| direction: str,
| do_weight_norm: bool = True,
| dropout: float = 0.0
| ) -> None
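A hedged instantiation sketch: the `[kernel_size, output_dim]` reading of each layer spec follows the `GatedCnnEncoder` examples below, and the `"forward"`/`"backward"` values for `direction` are assumptions, not confirmed by this page:

```python
block = ResidualBlock(
    input_dim=512,
    layers=[[4, 512], [4, 512]],  # two conv layers: [kernel_size, output_dim] (assumed format)
    direction="forward",          # assumed: pad causally so no position sees the future
    dropout=0.05,
)
```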
forward#
class ResidualBlock(torch.nn.Module):
| ...
| def forward(self, x: torch.Tensor) -> torch.Tensor
Input: `x` with shape `(batch_size, dim, timesteps)`. Output: `f(x) + x`, also with shape `(batch_size, dim, timesteps)`.
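For orientation, a minimal sketch of what such a gated residual block can compute, following the GLU gating of Dauphin et al. This is an illustration under stated assumptions (a single conv layer, causal left-padding, no weight norm, no dropout), not the library's implementation:

```python
import torch
import torch.nn.functional as F


class TinyGatedResidualBlock(torch.nn.Module):
    """Illustrative only: one gated conv layer with a residual connection."""

    def __init__(self, dim: int, kernel_size: int = 4) -> None:
        super().__init__()
        # 2 * dim output channels: half are values, half are gates (GLU).
        self.conv = torch.nn.Conv1d(dim, 2 * dim, kernel_size)
        self.kernel_size = kernel_size

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch_size, dim, timesteps).  Left-pad so the convolution is
        # causal and the number of timesteps is preserved.
        padded = F.pad(x, (self.kernel_size - 1, 0))
        projected = self.conv(padded)              # (batch_size, 2 * dim, timesteps)
        values, gates = projected.chunk(2, dim=1)
        return x + values * torch.sigmoid(gates)   # f(x) + x
```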
GatedCnnEncoder#
@Seq2SeqEncoder.register("gated-cnn-encoder")
class GatedCnnEncoder(Seq2SeqEncoder):
| def __init__(
| self,
| input_dim: int,
| layers: Sequence[Sequence[Sequence[int]]],
| dropout: float = 0.0,
| return_all_layers: bool = False
| ) -> None
This is work-in-progress and has not been fully tested yet. Use at your own risk!
A `Seq2SeqEncoder` that uses a Gated CNN.

See:

- Language Modeling with Gated Convolutional Networks, Yann N. Dauphin et al., ICML 2017. https://arxiv.org/abs/1612.08083
- Convolutional Sequence to Sequence Learning, Jonas Gehring et al., ICML 2017. https://arxiv.org/abs/1705.03122
Some possibilities:

Each element of the list is wrapped in a residual block:

```python
input_dim = 512
layers = [[[4, 512]], [[4, 512], [4, 512]], [[4, 512], [4, 512]], [[4, 512], [4, 512]]]
dropout = 0.05
```

A "bottleneck architecture":

```python
input_dim = 512
layers = [[[4, 512]], [[1, 128], [5, 128], [1, 512]], ...]
```

An architecture with dilated convolutions (see the receptive-field sketch after these examples):

```python
input_dim = 512
layers = [
    [[2, 512, 1]], [[2, 512, 2]], [[2, 512, 4]], [[2, 512, 8]],  # receptive field == 16
    [[2, 512, 1]], [[2, 512, 2]], [[2, 512, 4]], [[2, 512, 8]],  # receptive field == 31
    [[2, 512, 1]], [[2, 512, 2]], [[2, 512, 4]], [[2, 512, 8]],  # receptive field == 46
    [[2, 512, 1]], [[2, 512, 2]], [[2, 512, 4]], [[2, 512, 8]],  # receptive field == 57
]
```
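The receptive-field comments can be checked with simple arithmetic: each convolution with kernel size `k` and dilation `d` grows the receptive field by `(k - 1) * d`. A minimal sketch (the `receptive_field` helper is hypothetical, not part of the library):

```python
def receptive_field(layer_specs):
    """Receptive field of a stack of [kernel_size, dim, dilation] conv layers."""
    field = 1
    for spec in layer_specs:
        kernel_size = spec[0]
        dilation = spec[2] if len(spec) > 2 else 1
        field += (kernel_size - 1) * dilation
    return field

# Four dilated layers with kernel size 2 and dilations 1, 2, 4, 8:
print(receptive_field([[2, 512, 1], [2, 512, 2], [2, 512, 4], [2, 512, 8]]))  # 16
```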
Registered as a `Seq2SeqEncoder` with name "gated-cnn-encoder".
Parameters

- input_dim : `int`
  The dimension of the inputs.
- layers : `Sequence[Sequence[Sequence[int]]]`
  The layer dimensions for each `ResidualBlock`.
- dropout : `float`, optional (default = `0.0`)
  The dropout for each `ResidualBlock`.
- return_all_layers : `bool`, optional (default = `False`)
  Whether to return all layers or just the last layer.
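Putting the parameters together, a minimal construction sketch; the import follows the module path at the top of this page, and the layer values are illustrative, not prescriptive:

```python
from allennlp.modules.seq2seq_encoders.gated_cnn_encoder import GatedCnnEncoder

encoder = GatedCnnEncoder(
    input_dim=512,
    layers=[
        [[4, 512]],            # one residual block with a single conv layer
        [[4, 512], [4, 512]],  # one residual block with two conv layers
    ],
    dropout=0.05,
)
```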
forward#
class GatedCnnEncoder(Seq2SeqEncoder):
| ...
| def forward(
| self,
| token_embeddings: torch.Tensor,
| mask: torch.BoolTensor
| )
Note: `Conv1d` operates on inputs of shape `(batch_size, dim, timesteps)`, so the convolutions need transposed input: `token_embeddings` arrive as `(batch_size, timesteps, input_dim)` and are transposed before the convolutional stack is applied.
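A sketch of the shape handling this implies, assuming the standard AllenNLP `Seq2SeqEncoder` contract of `(batch_size, timesteps, input_dim)` inputs; the `apply_conv_stack` helper is hypothetical:

```python
import torch


def apply_conv_stack(token_embeddings: torch.Tensor,
                     mask: torch.BoolTensor,
                     conv_stack: torch.nn.Module) -> torch.Tensor:
    # Zero out padded positions, then move the feature dim next to batch,
    # since Conv1d expects (batch_size, dim, timesteps).
    x = token_embeddings * mask.unsqueeze(-1).to(token_embeddings.dtype)
    x = x.transpose(1, 2)      # (batch_size, input_dim, timesteps)
    x = conv_stack(x)          # (batch_size, output_dim, timesteps)
    return x.transpose(1, 2)   # back to (batch_size, timesteps, output_dim)
```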
get_input_dim#
class GatedCnnEncoder(Seq2SeqEncoder):
| ...
| def get_input_dim(self) -> int
get_output_dim#
class GatedCnnEncoder(Seq2SeqEncoder):
| ...
| def get_output_dim(self) -> int
is_bidirectional#
class GatedCnnEncoder(Seq2SeqEncoder):
| ...
| def is_bidirectional(self) -> bool
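For completeness, how these accessors are typically used. The `direction` argument on `ResidualBlock` and the presence of `is_bidirectional` suggest the encoder runs forward and backward stacks and concatenates them, in which case the output dimension would be twice the input dimension; treat the comments below as illustrative rather than guaranteed:

```python
from allennlp.modules.seq2seq_encoders.gated_cnn_encoder import GatedCnnEncoder

encoder = GatedCnnEncoder(input_dim=512, layers=[[[4, 512]], [[4, 512], [4, 512]]])

print(encoder.get_input_dim())     # 512
print(encoder.get_output_dim())    # per-timestep output size (1024 if both directions are concatenated)
print(encoder.is_bidirectional())  # whether the encoder uses context from both directions
```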