transformer_mc
TransformerMCReader
class TransformerMCReader(DatasetReader):
| def __init__(
| self,
| transformer_model_name: str = "roberta-large",
| length_limit: int = 512,
| **kwargs
| ) -> None
Read input data for the TransformerMC model. This is the base class for all readers that produce data for TransformerMC.
Instances have two fields:
* alternatives, a ListField of TextField
* correct_alternative, an IndexField with the correct answer among alternatives
Parameters
* transformer_model_name : str, optional (default=roberta-large)
  This reader chooses the tokenizer and token indexer according to this setting.
* length_limit : int, optional (default=512)
  The reader makes sure that the length of an alternative never exceeds this many word pieces.
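To illustrate what length_limit guarantees, here is a minimal sketch in plain Python. It is not the reader's actual implementation: `truncate_to_limit` is a hypothetical helper, whitespace splitting stands in for the real word-piece tokenizer, and cutting from the end is an assumed truncation policy.

```python
from typing import List


def truncate_to_limit(word_pieces: List[str], length_limit: int = 512) -> List[str]:
    """Keep at most `length_limit` word pieces (hypothetical helper, not part of the reader)."""
    if length_limit < 0:
        raise ValueError("length_limit must be non-negative")
    # Assumption: excess pieces are dropped from the end of the sequence.
    return word_pieces[:length_limit]


# Whitespace splitting is only a stand-in for the tokenizer that
# transformer_model_name would select.
pieces = "the quick brown fox jumps".split()
print(truncate_to_limit(pieces, length_limit=3))
```

Whatever the real truncation strategy is, the invariant described above is the same: no alternative ends up longer than length_limit word pieces.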
text_to_instance
class TransformerMCReader(DatasetReader):
| ...
| @overrides
| def text_to_instance(
| self,
| qid: str,
| start: str,
| alternatives: List[str],
| label: Optional[int]
| ) -> Instance
tokenize
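The following sketch shows one way the arguments of text_to_instance could map onto the two fields described for this reader. It is an illustration under stated assumptions, not the library's code: `SketchInstance` and `sketch_text_to_instance` are hypothetical names, whitespace splitting replaces the transformer tokenizer, special tokens are ignored, and the choice to shorten `start` before the alternative is an assumption.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class SketchInstance:
    """Plain-Python stand-in for an Instance (hypothetical, not AllenNLP's class)."""
    qid: str
    alternatives: List[List[str]]       # mirrors the ListField of TextField
    correct_alternative: Optional[int]  # mirrors the IndexField; None when unlabeled


def sketch_text_to_instance(
    qid: str,
    start: str,
    alternatives: List[str],
    label: Optional[int] = None,
    length_limit: int = 512,
) -> SketchInstance:
    # Assumption: whitespace tokens stand in for word pieces.
    start_tokens = start.split()
    sequences = []
    for alternative in alternatives:
        alt_tokens = alternative.split()[:length_limit]
        # Assumption: the alternative is kept in preference to the start text.
        room_for_start = length_limit - len(alt_tokens)
        sequences.append(start_tokens[:room_for_start] + alt_tokens)
    if label is not None and not 0 <= label < len(sequences):
        raise ValueError(f"label {label} is not a valid index into the alternatives")
    return SketchInstance(qid=qid, alternatives=sequences, correct_alternative=label)


instance = sketch_text_to_instance(
    qid="q1",
    start="The sky is",
    alternatives=["blue", "made of cheese"],
    label=0,
)
print(instance.alternatives)
print(instance.correct_alternative)
```

Passing `label=None` models an unlabeled example, such as at prediction time, in which case no correct alternative is recorded.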