@Seq2VecEncoder.register("cnn") class CnnEncoder(Seq2VecEncoder): | def __init__( | self, | embedding_dim: int, | num_filters: int, | ngram_filter_sizes: Tuple[int, ...] = (2, 3, 4, 5), | conv_layer_activation: Activation = None, | output_dim: Optional[int] = None | ) -> None
A CnnEncoder is a combination of multiple convolution layers and max pooling layers. As a
Seq2VecEncoder, the input to this module is of shape (batch_size, num_tokens,
input_dim), and the output is of shape (batch_size, output_dim).
The CNN has one convolution layer for each ngram filter size. Each convolution operation
produces a vector of size num_filters. A convolution layer is applied
num_tokens - ngram_size + 1 times, once per ngram window. The corresponding max pooling layer
aggregates all these outputs from the convolution layer and keeps the maximum for each filter.
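To make the window count and the pooling step concrete, here is a minimal pure-Python sketch for a single ngram size. It is an illustration only, not the library's implementation: the helper names ngram_windows and conv_and_max_pool, the toy token vectors, and the filter weights are all made up, and bias terms and the activation are left out.

```python
from typing import List


def ngram_windows(tokens: List[List[float]], ngram_size: int) -> List[List[float]]:
    """Return every window of `ngram_size` consecutive token vectors, flattened."""
    num_tokens = len(tokens)
    windows = []
    for start in range(num_tokens - ngram_size + 1):
        window: List[float] = []
        for vector in tokens[start:start + ngram_size]:
            window.extend(vector)
        windows.append(window)
    return windows


def conv_and_max_pool(
    tokens: List[List[float]], filters: List[List[float]], ngram_size: int
) -> List[float]:
    """Apply each filter to every window, then keep the max response per filter."""
    windows = ngram_windows(tokens, ngram_size)
    pooled = []
    for weights in filters:
        responses = [sum(w * x for w, x in zip(weights, window)) for window in windows]
        pooled.append(max(responses))
    return pooled


# Toy sentence: 5 tokens with embedding_dim = 2 (values invented for illustration).
tokens = [[1.0, 0.0], [0.0, 1.0], [2.0, 1.0], [0.0, 0.0], [1.0, 4.0]]
# num_filters = 2 hypothetical filters for ngram_size = 2 (each has 2 * 2 weights).
filters = [[1.0, 0.0, 0.0, 1.0], [0.0, 1.0, 1.0, 0.0]]

windows = ngram_windows(tokens, ngram_size=2)
pooled = conv_and_max_pool(tokens, filters, ngram_size=2)
print(len(windows))  # 5 - 2 + 1 = 4 windows
print(pooled)        # one pooled value per filter: [4.0, 3.0]
```

The pooled result has num_filters entries regardless of how many tokens the sentence contains, which is what lets the encoder return a fixed-size vector.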
This operation is repeated for every ngram size passed, and consequently the dimensionality of
the output after max pooling is
len(ngram_filter_sizes) * num_filters. This output can then
(optionally) be projected by a fully connected layer to the size specified by
output_dim. For more details, refer to "A Sensitivity Analysis of (and Practitioners’ Guide to) Convolutional Neural Networks for Sentence Classification", Zhang and Wallace 2016, particularly Figure 1.
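The shape bookkeeping described above can be traced with a small pure-Python sketch. The function trace_shapes is hypothetical, and the channels-first layout shown for the convolution output, (batch_size, num_filters, num_windows), is an assumption about how such a layer is typically implemented rather than something this page states.

```python
from typing import List, Optional, Tuple

Shape = Tuple[int, ...]


def trace_shapes(
    batch_size: int,
    num_tokens: int,
    embedding_dim: int,
    num_filters: int,
    ngram_filter_sizes: Tuple[int, ...] = (2, 3, 4, 5),
    output_dim: Optional[int] = None,
) -> List[Tuple[str, Shape]]:
    """List the tensor shape after each stage of the encoder (illustrative only)."""
    shapes: List[Tuple[str, Shape]] = [("input", (batch_size, num_tokens, embedding_dim))]
    for ngram_size in ngram_filter_sizes:
        # One convolution output per ngram window.
        num_windows = num_tokens - ngram_size + 1
        shapes.append((f"conv_{ngram_size}", (batch_size, num_filters, num_windows)))
        # Max pooling collapses the window axis.
        shapes.append((f"max_pool_{ngram_size}", (batch_size, num_filters)))
    # The pooled vectors of all ngram sizes are concatenated.
    pooled_dim = len(ngram_filter_sizes) * num_filters
    shapes.append(("concatenated", (batch_size, pooled_dim)))
    if output_dim is not None:
        # Optional fully connected projection.
        shapes.append(("projection", (batch_size, output_dim)))
    return shapes


trace = dict(trace_shapes(8, 20, 100, num_filters=50, output_dim=64))
for stage, shape in trace.items():
    print(stage, shape)
```

With 20 tokens and the default filter sizes, the window axis shrinks from 19 (bigrams) to 16 (5-grams), while every pooled vector has 50 entries; concatenation gives 200 features, which the projection reduces to 64.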
Registered as a Seq2VecEncoder with name "cnn".
- embedding_dim :
int, required
This is the input dimension to the encoder. We need this because we can't do shape inference in PyTorch, and we need to know what size filters to construct in the CNN.
- num_filters :
int, required
This is the output dimension of each convolutional layer, which is the number of "filters" learned by that layer.
- ngram_filter_sizes :
Tuple[int], optional (default = (2, 3, 4, 5))
This specifies both the number of convolutional layers we will create and their sizes. The default of (2, 3, 4, 5) will have four convolutional layers, corresponding to encoding ngrams of size 2 to 5 with some number of filters.
- conv_layer_activation :
Activation, optional (signature default: None)
Activation to use after the convolution layers.
- output_dim :
Optional[int], optional (default = None)
After doing convolutions and pooling, we'll project the collected features into a vector of this size. If this value is None, we will just return the result of the max pooling, giving an output of size len(ngram_filter_sizes) * num_filters.
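One way to see why embedding_dim is needed up front is to count the weights these parameters imply. The sketch below is a rough estimate under stated assumptions: each filter spans ngram_size tokens of embedding_dim values and has one bias term, and the optional projection is a fully connected layer with a bias. The function count_parameters is hypothetical and is not part of the library.

```python
from typing import Optional, Tuple


def count_parameters(
    embedding_dim: int,
    num_filters: int,
    ngram_filter_sizes: Tuple[int, ...] = (2, 3, 4, 5),
    output_dim: Optional[int] = None,
) -> int:
    """Estimate the number of learned weights (assumes biases on every layer)."""
    total = 0
    for ngram_size in ngram_filter_sizes:
        # Each filter covers ngram_size * embedding_dim inputs, plus one bias.
        total += num_filters * (ngram_size * embedding_dim + 1)
    if output_dim is not None:
        # Assumed projection: weight matrix plus bias vector.
        pooled_dim = len(ngram_filter_sizes) * num_filters
        total += pooled_dim * output_dim + output_dim
    return total


without_projection = count_parameters(embedding_dim=100, num_filters=50)
with_projection = count_parameters(embedding_dim=100, num_filters=50, output_dim=64)
print(without_projection)  # 70200
print(with_projection)     # 83064
```

The filter weights scale with embedding_dim, so they cannot be allocated until that value is known; the projection only depends on the pooled size and output_dim.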
class CnnEncoder(Seq2VecEncoder): | ... | @overrides | def get_input_dim(self) -> int
class CnnEncoder(Seq2VecEncoder): | ... | @overrides | def get_output_dim(self) -> int
class CnnEncoder(Seq2VecEncoder): | ... | def forward(self, tokens: torch.Tensor, mask: torch.BoolTensor)