transformer_superglue_rte

allennlp_models.pair_classification.dataset_readers.transformer_superglue_rte


TransformerSuperGlueRteReader

@DatasetReader.register("transformer_superglue_rte")
class TransformerSuperGlueRteReader(DatasetReader):
 | def __init__(
 |     self,
 |     transformer_model_name: str = "roberta-base",
 |     tokenizer_kwargs: Dict[str, Any] = None,
 |     **kwargs
 | ) -> None

Dataset reader for the SuperGLUE Recognizing Textual Entailment task, to be used with a transformer model such as RoBERTa. The dataset is in the JSON Lines format.
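As a rough illustration of the input format, the sketch below parses JSON Lines using only the standard library. The field names (premise, hypothesis, label, idx) and the label strings are assumptions based on the public SuperGLUE RTE files, the two records are made up for the example, and read_rte_jsonl is a hypothetical helper, not part of the reader.

```python
import io
import json
from typing import Any, Dict, Iterator, TextIO

# Made-up records in the assumed SuperGLUE RTE layout: one JSON object per line.
SAMPLE = (
    '{"premise": "A dog is running in the park.", '
    '"hypothesis": "An animal is outside.", "label": "entailment", "idx": 0}\n'
    '{"premise": "The cafe opens at nine.", '
    '"hypothesis": "The cafe is closed all day.", "label": "not_entailment", "idx": 1}\n'
)


def read_rte_jsonl(handle: TextIO) -> Iterator[Dict[str, Any]]:
    """Yield one dict per non-empty line (hypothetical helper)."""
    for line in handle:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between records
        record = json.loads(line)
        yield {
            "index": record["idx"],
            "premise": record["premise"],
            "hypothesis": record["hypothesis"],
            # Unlabeled splits may omit the label, so fall back to None.
            "label": record.get("label"),
        }


examples = list(read_rte_jsonl(io.StringIO(SAMPLE)))
```

Each yielded dict carries the four values that text_to_instance takes as arguments (index, label, premise, hypothesis).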

It will generate Instances with the following fields:

  • tokens, a TextField that contains the concatenation of premise and hypothesis,
  • label, a LabelField containing the gold label, if one exists,
  • metadata, a MetadataField that stores the instance's index in the file, the original premise, the original hypothesis, both of these in tokenized form, and the gold label. These are accessible as metadata['index'], metadata['premise'], metadata['hypothesis'], metadata['tokens'], and metadata['label'].
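The concatenation stored in tokens can be pictured with a plain-Python sketch. It assumes RoBERTa-style special tokens (<s> premise </s></s> hypothesis </s>) and uses whitespace splitting as a stand-in for the real subword tokenizer. concat_pair is a hypothetical helper for illustration only and does not reflect how the reader is actually implemented.

```python
from typing import List


def concat_pair(
    premise_tokens: List[str],
    hypothesis_tokens: List[str],
    bos: str = "<s>",
    sep: str = "</s>",
) -> List[str]:
    """Join two token lists in the assumed RoBERTa pair layout (hypothetical helper)."""
    # Layout: <s> premise </s></s> hypothesis </s>
    return [bos, *premise_tokens, sep, sep, *hypothesis_tokens, sep]


# Whitespace splitting stands in for the transformer's subword tokenizer.
premise_tokens = "A dog is running".split()
hypothesis_tokens = "An animal moves".split()

tokens = concat_pair(premise_tokens, hypothesis_tokens)
```

Other transformer families use different special tokens, so the exact layout depends on the model selected.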

Parameters

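  • transformer_model_name : str, optional (default = 'roberta-base')
    The name of the pretrained transformer model. The reader chooses its tokenizer according to this setting.
  • tokenizer_kwargs : Dict[str, Any], optional (default = None)
    Extra keyword arguments for the underlying tokenizer.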
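In an AllenNLP configuration, a registered reader is selected by setting the type key to its registered name, here "transformer_superglue_rte", and constructor arguments such as transformer_model_name follow as sibling keys. The fragment below is a minimal sketch that builds such a configuration as a Python dict and serializes it to JSON. Placing it under a top-level dataset_reader key follows the usual AllenNLP config layout and is an assumption about how the surrounding file is organized.

```python
import json

# Reader settings: "type" is the registered name, and the remaining keys
# are constructor arguments taken from the signature above.
reader_config = {
    "type": "transformer_superglue_rte",
    "transformer_model_name": "roberta-base",
}

# Assumed placement inside a larger experiment configuration.
config_fragment = json.dumps({"dataset_reader": reader_config}, indent=2)
print(config_fragment)
```

Any model name used here should match the model the downstream classifier loads, so that tokenization and embeddings agree.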

text_to_instance

class TransformerSuperGlueRteReader(DatasetReader):
 | ...
 | def text_to_instance(
 |     self,
 |     index: int,
 |     label: str,
 |     premise: str,
 |     hypothesis: str
 | ) -> Instance

apply_token_indexers

class TransformerSuperGlueRteReader(DatasetReader):
 | ...
 | def apply_token_indexers(self, instance: Instance) -> None