transformer_superglue_rte
allennlp_models.pair_classification.dataset_readers.transformer_superglue_rte
TransformerSuperGlueRteReader
@DatasetReader.register("transformer_superglue_rte")
class TransformerSuperGlueRteReader(DatasetReader):
| def __init__(
| self,
| transformer_model_name: str = "roberta-base",
| tokenizer_kwargs: Dict[str, Any] = None,
| **kwargs
| ) -> None
Dataset reader for the SuperGLUE Recognizing Textual Entailment task, to be used with a transformer model such as RoBERTa. The dataset is in the JSON Lines format.
It will generate Instances with the following fields:
- tokens, a TextField that contains the concatenation of premise and hypothesis,
- label, a LabelField containing the label, if one exists,
- metadata, a MetadataField that stores the instance's index in the file, the original premise, the original hypothesis, both of these in tokenized form, and the gold label, accessible as metadata['index'], metadata['premise'], metadata['hypothesis'], metadata['tokens'], and metadata['label'].
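To make the input format concrete, here is a minimal standard-library sketch of reading JSON Lines records into dictionaries that mirror the metadata keys listed above. It is not the reader's actual implementation: the record keys (`idx`, `premise`, `hypothesis`, `label`), the sample sentences, and the helper name `parse_rte_lines` are assumptions made for illustration.

```python
import json

# Two made-up records; the key names are an assumption about the file layout.
SAMPLE_JSONL = """\
{"idx": 0, "premise": "A dog is running in the park.", "hypothesis": "An animal is outside.", "label": "entailment"}
{"idx": 1, "premise": "The shop closes at noon.", "hypothesis": "The shop is open all night.", "label": "not_entailment"}
"""


def parse_rte_lines(text):
    """Yield one dict per non-empty JSON Lines record (hypothetical helper)."""
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines between records
        record = json.loads(line)
        yield {
            "index": record["idx"],
            "premise": record["premise"],
            "hypothesis": record["hypothesis"],
            # Unlabeled splits may omit the label, so fall back to None.
            "label": record.get("label"),
        }


records = list(parse_rte_lines(SAMPLE_JSONL))
print(records[0]["index"], records[0]["label"])
```

Each line of the file is parsed independently, which is what lets a JSON Lines dataset be streamed one record at a time.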
Parameters
- transformer_model_name : str, optional (default = 'roberta-base')
  The reader chooses its tokenizer according to this setting.
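As a rough illustration, the registered name and the constructor argument could appear in an experiment configuration as sketched below. This assumes the usual AllenNLP convention that the `type` key selects the registered class and the remaining keys are passed to its constructor. The variable name `reader_config` is hypothetical, and the rest of a full configuration (model, data paths, trainer) is deliberately left out.

```python
import json

# Hypothetical config fragment: "type" is the name passed to
# DatasetReader.register, the other key is a constructor argument.
reader_config = {
    "dataset_reader": {
        "type": "transformer_superglue_rte",
        "transformer_model_name": "roberta-base",
    }
}

# Print the fragment as JSON, the form it would take in a config file.
print(json.dumps(reader_config, indent=2))
```

Changing `transformer_model_name` here is how a different tokenizer would be selected for the reader.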
text_to_instance
class TransformerSuperGlueRteReader(DatasetReader):
| ...
| def text_to_instance(
| self,
| index: int,
| label: str,
| premise: str,
| hypothesis: str
| ) -> Instance
apply_token_indexers
class TransformerSuperGlueRteReader(DatasetReader):
| ...
| def apply_token_indexers(self, instance: Instance) -> None