transformer_mc

allennlp_models.mc.dataset_readers.transformer_mc


TransformerMCReader

class TransformerMCReader(DatasetReader):
 | def __init__(
 |     self,
 |     transformer_model_name: str = "roberta-large",
 |     length_limit: int = 512,
 |     **kwargs
 | ) -> None

Read input data for the TransformerMC model. This is the base class for all readers that produce data for TransformerMC.

Instances have three fields:

* alternatives, a ListField of TextField
* correct_alternative, an IndexField with the correct answer among alternatives
* qid, a MetadataField containing question ids

Parameters

transformer_model_name : `str`, optional (default=`"roberta-large"`)

This reader chooses tokenizer and token indexer according to this setting.

length_limit : `int`, optional (default=`512`)

We will make sure that the length of an alternative never exceeds this many word pieces.
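To make the effect of `length_limit` concrete, here is a minimal sketch of one way a start/alternative pair could be trimmed to a word-piece budget. It is not taken from the reader's source: the helper name `truncate_pair`, the `num_special_tokens` value of 4, and the policy of keeping the alternative and trimming the start from the left are all assumptions made for illustration.

```python
from typing import List


def truncate_pair(
    start_pieces: List[str],
    alternative_pieces: List[str],
    length_limit: int = 512,
    num_special_tokens: int = 4,  # assumed overhead for a two-segment input
) -> List[str]:
    """Fit start + alternative into length_limit pieces (hypothetical policy)."""
    budget = length_limit - num_special_tokens
    if budget <= 0:
        return []
    # Assumption: the alternative has priority; cut its tail if it is too long alone.
    alternative_pieces = alternative_pieces[:budget]
    room_for_start = budget - len(alternative_pieces)
    # Assumption: keep the end of the start text, since it sits next to the alternative.
    kept_start = start_pieces[-room_for_start:] if room_for_start > 0 else []
    return kept_start + alternative_pieces
```

With the defaults, the combined sequence never holds more than 508 pieces, which leaves room for the assumed four special tokens inside the 512 limit.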

text_to_instance

class TransformerMCReader(DatasetReader):
 | ...
 | def text_to_instance(
 |     self,
 |     qid: str,
 |     start: str,
 |     alternatives: List[str],
 |     label: Optional[int] = None
 | ) -> Instance
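The sketch below mirrors the signature above with plain Python containers, to show how the arguments could map onto the three documented fields. It is a stand-in, not the AllenNLP implementation: `sketch_instance` is a hypothetical name, joining `start` and each alternative with a space is a simplification, and omitting `correct_alternative` when `label` is `None` is an assumption.

```python
from typing import Any, Dict, List, Optional


def sketch_instance(
    qid: str,
    start: str,
    alternatives: List[str],
    label: Optional[int] = None,
) -> Dict[str, Any]:
    """Plain-dict stand-in for an Instance (illustration only)."""
    if label is not None and not 0 <= label < len(alternatives):
        raise ValueError(f"label {label} is not a valid index into alternatives")
    fields: Dict[str, Any] = {
        # Stand-in for a ListField of TextField: one text per alternative.
        "alternatives": [f"{start.strip()} {alt.strip()}" for alt in alternatives],
        # Stand-in for a MetadataField carrying the question id.
        "qid": qid,
    }
    if label is not None:
        # Stand-in for an IndexField pointing into "alternatives".
        fields["correct_alternative"] = label
    return fields


example = sketch_instance(
    qid="q1",
    start="The sky is",
    alternatives=["blue.", "made of cheese."],
    label=0,
)
```

Here `example["correct_alternative"]` indexes into `example["alternatives"]`, which is the relationship the IndexField expresses in the real reader.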

tokenize