basic_classifier

allennlp.models.basic_classifier


BasicClassifier

@Model.register("basic_classifier")
class BasicClassifier(Model):
 | def __init__(
 |     self,
 |     vocab: Vocabulary,
 |     text_field_embedder: TextFieldEmbedder,
 |     seq2vec_encoder: Seq2VecEncoder,
 |     seq2seq_encoder: Seq2SeqEncoder = None,
 |     feedforward: Optional[FeedForward] = None,
 |     dropout: float = None,
 |     num_labels: int = None,
 |     label_namespace: str = "labels",
 |     namespace: str = "tokens",
 |     initializer: InitializerApplicator = InitializerApplicator(),
 |     **kwargs
 | ) -> None

This Model implements a basic text classifier. After embedding the text from a TextField, we optionally encode the embeddings with a Seq2SeqEncoder. The resulting sequence is pooled using a Seq2VecEncoder, optionally passed through a feedforward layer, and then passed to a linear classification layer, which projects into the label space. If a Seq2SeqEncoder is not provided, we pass the embedded text directly to the Seq2VecEncoder.
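The data flow described above can be sketched without any framework. The following is a minimal illustration, assuming plain Python lists in place of tensors, mean pooling as a stand-in for the Seq2VecEncoder, and no batching or masking; the names `mean_pool`, `linear`, and `classify` are hypothetical and are not part of AllenNLP.

```python
from typing import Callable, List, Optional

Vector = List[float]
Matrix = List[Vector]


def mean_pool(sequence: Matrix) -> Vector:
    """Stand-in for a Seq2VecEncoder: average the token vectors into one vector."""
    dim = len(sequence[0])
    return [sum(vec[i] for vec in sequence) / len(sequence) for i in range(dim)]


def linear(vector: Vector, weights: Matrix, bias: Vector) -> Vector:
    """Project a vector into label space, using one weight row per label."""
    return [sum(w * x for w, x in zip(row, vector)) + b for row, b in zip(weights, bias)]


def classify(
    token_ids: List[int],
    embedding_table: Matrix,
    weights: Matrix,
    bias: Vector,
    seq2seq: Optional[Callable[[Matrix], Matrix]] = None,
) -> Vector:
    # 1. Embed each token id (stand-in for the text_field_embedder).
    embedded = [embedding_table[i] for i in token_ids]
    # 2. Optionally re-encode the sequence (stand-in for the seq2seq_encoder).
    if seq2seq is not None:
        embedded = seq2seq(embedded)
    # 3. Pool the sequence into a single vector (stand-in for the seq2vec_encoder).
    pooled = mean_pool(embedded)
    # 4. Project into label space (the linear classification layer).
    return linear(pooled, weights, bias)


# Toy values, chosen only for illustration.
embedding_table = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
weights = [[1.0, 0.0], [0.0, 1.0]]
bias = [0.0, 0.0]

logits = classify([1, 1, 2], embedding_table, weights, bias)
print(logits)
```

When `seq2seq` is left as `None`, step 2 is skipped and the embedded tokens go straight to pooling, which mirrors the fallback described in the paragraph above.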

Registered as a Model with name "basic_classifier".

Parameters

  • vocab : Vocabulary
  • text_field_embedder : TextFieldEmbedder
    Used to embed the input text from a TextField.
  • seq2seq_encoder : Seq2SeqEncoder, optional (default = None)
    Optional Seq2Seq encoder layer for the input text.
  • seq2vec_encoder : Seq2VecEncoder
    Required Seq2Vec encoder layer. If seq2seq_encoder is provided, this encoder will pool its output. Otherwise, this encoder will operate directly on the output of the text_field_embedder.
  • feedforward : FeedForward, optional (default = None)
    An optional feedforward layer to apply after the seq2vec_encoder.
  • dropout : float, optional (default = None)
    Dropout probability to use. If None, no dropout is applied.
  • num_labels : int, optional (default = None)
    Number of labels to project to in classification layer. By default, the classification layer will project to the size of the vocabulary namespace corresponding to labels.
  • namespace : str, optional (default = "tokens")
    Vocabulary namespace corresponding to the input text. By default, we use the "tokens" namespace.
  • label_namespace : str, optional (default = "labels")
    Vocabulary namespace corresponding to labels. By default, we use the "labels" namespace.
  • initializer : InitializerApplicator, optional (default = InitializerApplicator())
    If provided, will be used to initialize the model parameters.
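As an illustration of how these parameters might appear in a configuration, here is a sketch of a model section written as a Python dictionary and serialized to JSON. The sub-component types (`"embedding"`, `"bag_of_embeddings"`) and the dimension of 50 are assumptions chosen for the example, not values prescribed by this page; only the `"basic_classifier"` type name and the parameter names come from the documentation above.

```python
import json

# Keys mirror the constructor parameters documented above.
model_config = {
    "type": "basic_classifier",
    "text_field_embedder": {
        "token_embedders": {
            # Assumed embedder type and size, for illustration only.
            "tokens": {"type": "embedding", "embedding_dim": 50},
        }
    },
    # Required pooling encoder; its input size must match the embedder output.
    "seq2vec_encoder": {"type": "bag_of_embeddings", "embedding_dim": 50},
    # Optional settings; omitted keys fall back to the documented defaults.
    "dropout": 0.1,
    "namespace": "tokens",
    "label_namespace": "labels",
}

print(json.dumps({"model": model_config}, indent=2))
```

The optional `seq2seq_encoder`, `feedforward`, and `num_labels` keys are left out here, so they would take their default of None.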

forward

class BasicClassifier(Model):
 | ...
 | def forward(
 |     self,
 |     tokens: TextFieldTensors,
 |     label: torch.IntTensor = None,
 |     metadata: MetadataField = None
 | ) -> Dict[str, torch.Tensor]

Parameters

  • tokens : TextFieldTensors
    From a TextField
  • label : torch.IntTensor, optional (default = None)
    From a LabelField

Returns

  • An output dictionary consisting of:

    • logits (torch.FloatTensor) : A tensor of shape (batch_size, num_labels) representing unnormalized log probabilities of the label.
    • probs (torch.FloatTensor) : A tensor of shape (batch_size, num_labels) representing probabilities of the label.
    • loss (torch.FloatTensor, optional) : A scalar loss to be optimised.
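The relationship between the three outputs can be sketched in plain Python. This is only an illustration: it assumes that probs is a softmax over logits and that the loss is the mean cross-entropy over the batch, which this page does not state explicitly. Nested lists stand in for tensors, and the helper names are hypothetical.

```python
import math
from typing import Dict, List


def softmax(logits: List[float]) -> List[float]:
    """Normalize one row of logits into probabilities that sum to 1."""
    shift = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(batch_logits: List[List[float]], labels: List[int]) -> float:
    """Mean negative log-probability of the gold label (assumed loss)."""
    losses = [-math.log(softmax(row)[label]) for row, label in zip(batch_logits, labels)]
    return sum(losses) / len(losses)


# Shape (batch_size=2, num_labels=3); values are made up for illustration.
batch_logits = [[2.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
labels = [0, 2]

output: Dict[str, object] = {
    "logits": batch_logits,
    "probs": [softmax(row) for row in batch_logits],
}
# The loss entry is only present when gold labels are supplied.
output["loss"] = cross_entropy(batch_logits, labels)
print(output)
```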

make_output_human_readable

class BasicClassifier(Model):
 | ...
 | def make_output_human_readable(
 |     self,
 |     output_dict: Dict[str, torch.Tensor]
 | ) -> Dict[str, torch.Tensor]

Does a simple argmax over the probabilities, converts each index to its string label, and adds a "label" key to the dictionary with the result.
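The step above can be sketched as follows. This is a minimal illustration, assuming nested lists in place of tensors and an ordinary dictionary (`index_to_label`, a hypothetical stand-in) in place of the vocabulary lookup for the label namespace; the function name is also hypothetical.

```python
from typing import Dict, List


def make_human_readable(
    output: Dict[str, List[List[float]]],
    index_to_label: Dict[int, str],
) -> Dict[str, object]:
    labels = []
    for row in output["probs"]:
        # Argmax: index of the highest probability in this row.
        best = max(range(len(row)), key=row.__getitem__)
        # Fall back to the index as a string if no label is known for it.
        labels.append(index_to_label.get(best, str(best)))
    result: Dict[str, object] = dict(output)
    result["label"] = labels
    return result


# Made-up probabilities and label mapping, for illustration only.
output = {"probs": [[0.7, 0.3], [0.2, 0.8]]}
index_to_label = {0: "neg", 1: "pos"}

readable = make_human_readable(output, index_to_label)
print(readable["label"])
```

The original keys are kept in the returned dictionary, so "probs" remains available alongside the new "label" entry.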

get_metrics

class BasicClassifier(Model):
 | ...
 | def get_metrics(self, reset: bool = False) -> Dict[str, float]

default_predictor

class BasicClassifier(Model):
 | ...
 | default_predictor = "text_classifier"