transformer_superglue_rte
allennlp_models.pair_classification.dataset_readers.transformer_superglue_rte
TransformerSuperGlueRteReader
@DatasetReader.register("transformer_superglue_rte")
class TransformerSuperGlueRteReader(DatasetReader):
| def __init__(
| self,
| transformer_model_name: str = "roberta-base",
| tokenizer_kwargs: Dict[str, Any] = None,
| **kwargs
| ) -> None
Dataset reader for the SuperGLUE Recognizing Textual Entailment task, to be used with a transformer model such as RoBERTa. The dataset is in the JSON Lines format.
It will generate Instances with the following fields:
- tokens, a TextField that contains the concatenation of premise and hypothesis.
- label, a LabelField containing the label, if one exists.
- metadata, a MetadataField that stores the instance's index in the file, the original premise, the original hypothesis, both of these in tokenized form, and the gold label, accessible as metadata['index'], metadata['premise'], metadata['hypothesis'], metadata['tokens'], and metadata['label'].
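As a rough illustration of the input side, the following standard-library sketch parses JSON Lines records into the four values that text_to_instance takes. The key names idx, premise, hypothesis, and label are assumptions about the file layout, the helper iter_rte_examples is hypothetical, and the two sample records are made up for the example. It does not reproduce the reader's tokenization or field construction.

```python
import io
import json
from typing import Any, Dict, Iterable, Iterator


def iter_rte_examples(lines: Iterable[str]) -> Iterator[Dict[str, Any]]:
    """Hypothetical helper: yield one dict per non-empty JSON Lines record."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between records
        example = json.loads(line)
        yield {
            # Assumed key names; adjust them if your file uses different ones.
            "index": int(example["idx"]),
            "premise": example["premise"],
            "hypothesis": example["hypothesis"],
            # Unlabeled records (e.g. a test split) yield None here.
            "label": example.get("label"),
        }


# Made-up sample data standing in for a .jsonl file on disk.
sample = io.StringIO(
    '{"premise": "A dog is running in the park.", '
    '"hypothesis": "An animal is outdoors.", "label": "entailment", "idx": 0}\n'
    '{"premise": "The cafe opens at nine.", '
    '"hypothesis": "The cafe is closed on Sundays.", "idx": 1}\n'
)

records = list(iter_rte_examples(sample))
for record in records:
    print(record["index"], record["label"])
```

Each resulting dict carries the same pieces of information that the metadata field is described as storing, minus the tokenized form.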
Parameters
- transformer_model_name :
str, optional (default = 'roberta-base')
The reader chooses its tokenizer according to this setting.
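In an AllenNLP configuration file, a registered reader is selected through the "type" key, and constructor arguments appear as sibling keys. The sketch below builds such a fragment as a Python dict and prints it as JSON. It assumes the reader sits under the usual top-level "dataset_reader" key and uses the registered name and default model name from the signature above; a real config would also need model, data path, and trainer sections, which are omitted here.

```python
import json

# Minimal dataset_reader fragment; the rest of the config is intentionally left out.
reader_config = {
    "dataset_reader": {
        # Name passed to @DatasetReader.register(...)
        "type": "transformer_superglue_rte",
        # Constructor argument that determines which tokenizer is used
        "transformer_model_name": "roberta-base",
    }
}

config_text = json.dumps(reader_config, indent=2)
print(config_text)
```

Swapping "roberta-base" for another model name should change the tokenizer the reader uses, provided the model downstream is configured to match.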
text_to_instance
class TransformerSuperGlueRteReader(DatasetReader):
| ...
| def text_to_instance(
| self,
| index: int,
| label: str,
| premise: str,
| hypothesis: str
| ) -> Instance
apply_token_indexers
class TransformerSuperGlueRteReader(DatasetReader):
| ...
| def apply_token_indexers(self, instance: Instance) -> None