transformer_mc_tt
allennlp_models.mc.dataset_readers.transformer_mc_tt
TransformerMCReaderTransformerToolkit#
class TransformerMCReaderTransformerToolkit(DatasetReader):
| def __init__(
|     self,
|     transformer_model_name: str = "roberta-large",
|     length_limit: int = 512,
|     **kwargs
| ) -> None
Read input data for the TransformerMC model. This is the base class for all readers that produce data for TransformerMCTransformerToolkit.
Instances have three fields:
* `alternatives`, a `ListField` of `TransformerTextField`
* `correct_alternative`, an `IndexField` with the correct answer among `alternatives`
* `qid`, a `MetadataField` containing question ids
Parameters
transformer_model_name : `str`, optional (default=`"roberta-large"`)
This reader chooses tokenizer and token indexer according to this setting.
length_limit : `int`, optional (default=`512`)
We will make sure that the length of an alternative never exceeds this many word pieces.
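The effect of `length_limit` can be pictured with a small sketch. The description above does not say how the reader enforces the limit, so plain truncation is an assumption here. `clip_word_pieces` is a hypothetical helper, not part of AllenNLP, and whitespace-split tokens stand in for the word pieces a real tokenizer would produce.

```python
from typing import List


def clip_word_pieces(pieces: List[str], length_limit: int = 512) -> List[str]:
    """Hypothetical helper: keep at most `length_limit` pieces (assumed truncation)."""
    if length_limit < 0:
        raise ValueError("length_limit must be non-negative")
    return pieces[:length_limit]


# Whitespace tokens stand in for word pieces; a real reader would use a tokenizer.
pieces = "the quick brown fox jumps over the lazy dog".split()
clipped = clip_word_pieces(pieces, length_limit=4)
print(clipped)
```

With the default limit of 512, a short input like this one passes through unchanged.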
text_to_instance#
class TransformerMCReaderTransformerToolkit(DatasetReader):
| ...
| def text_to_instance(
|     self,
|     qid: str,
|     start: str,
|     alternatives: List[str],
|     label: Optional[int] = None
| ) -> Instance
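As a rough illustration of how these arguments could map onto the three instance fields, here is a minimal sketch in plain Python. It does not use AllenNLP: `SketchInstance` and `sketch_text_to_instance` are hypothetical stand-ins, and pairing each alternative with the shared `start` text is an assumption about how the arguments are combined.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class SketchInstance:
    """Hypothetical stand-in for an AllenNLP Instance, holding plain Python values."""

    alternatives: List[str]             # stands in for a ListField of TransformerTextField
    correct_alternative: Optional[int]  # stands in for an IndexField; None when unlabeled
    qid: str                            # stands in for a MetadataField


def sketch_text_to_instance(
    qid: str,
    start: str,
    alternatives: List[str],
    label: Optional[int] = None,
) -> SketchInstance:
    # The label, when given, must point at one of the alternatives.
    if label is not None and not 0 <= label < len(alternatives):
        raise ValueError("label must index into alternatives")
    # Assumption: every alternative is combined with the shared start text.
    paired = [f"{start} {alt}" for alt in alternatives]
    return SketchInstance(alternatives=paired, correct_alternative=label, qid=qid)


instance = sketch_text_to_instance("q1", "The sky is", ["blue.", "loud."], label=0)
print(instance)
```

Leaving `label` at its default of `None` models the unlabeled case, for example at prediction time.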