# boolq

allennlp_models.classification.dataset_readers.boolq

## BoolQDatasetReader
```python
@DatasetReader.register("boolq")
class BoolQDatasetReader(DatasetReader):
    def __init__(
        self,
        tokenizer: Tokenizer = None,
        token_indexers: Dict[str, TokenIndexer] = None,
        **kwargs
    )
```
This `DatasetReader` reads the BoolQ data for the binary QA task. The output of `read` is a list of `Instance`s with the following fields:

- `tokens` : `TextField`
- `label` : `LabelField`

Registered as a `DatasetReader` with name "boolq".
### Parameters

- **tokenizer** : `Tokenizer`, optional (default = `WhitespaceTokenizer()`)
  Tokenizer to use to split the input sequences into words or other kinds of tokens.
- **token_indexers** : `Dict[str, TokenIndexer]`, optional (default = `{"tokens": SingleIdTokenIndexer()}`)
  We use this to define the input representation for the text. See `TokenIndexer`.
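As a hedged usage sketch (assuming `allennlp` and `allennlp-models` are installed; the transformer model name is only an illustrative choice), the reader can be constructed with its defaults or with transformer-based tokenization:

```python
# Minimal sketch; assumes allennlp and allennlp-models are installed.
from allennlp.data.token_indexers import PretrainedTransformerIndexer
from allennlp.data.tokenizers import PretrainedTransformerTokenizer
from allennlp_models.classification.dataset_readers.boolq import BoolQDatasetReader

# Default setup: WhitespaceTokenizer and {"tokens": SingleIdTokenIndexer()}.
reader = BoolQDatasetReader()

# Transformer setup ("bert-base-uncased" is only an example model name).
model_name = "bert-base-uncased"
reader = BoolQDatasetReader(
    tokenizer=PretrainedTransformerTokenizer(model_name),
    token_indexers={"tokens": PretrainedTransformerIndexer(model_name)},
)
```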
## text_to_instance
```python
class BoolQDatasetReader(DatasetReader):
    ...
    def text_to_instance(
        self,
        passage: str,
        question: str,
        label: Optional[bool] = None
    ) -> Instance
```
We take the passage and the question as input, tokenize them, and concatenate them into a single sequence of tokens.
### Parameters

- **passage** : `str`
  The passage in a given BoolQ record.
- **question** : `str`
  The question in a given BoolQ record.
- **label** : `bool`, optional (default = `None`)
  The label for the passage and the question.
### Returns

- An `Instance` containing the following fields:
  - `tokens` : `TextField`
    The tokens in the concatenation of the passage and the question.
  - `label` : `LabelField`
    The answer to the question.
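A hedged sketch of calling `text_to_instance` directly (the passage and question strings below are made up for illustration):

```python
# Hedged sketch; the passage/question text is invented for illustration.
reader = BoolQDatasetReader()  # defaults: WhitespaceTokenizer + SingleIdTokenIndexer

instance = reader.text_to_instance(
    passage="Mount Everest is Earth's highest mountain above sea level.",
    question="is mount everest the highest mountain in the world",
    label=True,
)

print(instance.fields.keys())     # dict_keys(['tokens', 'label'])
print(instance.fields["tokens"])  # TextField over the passage + question tokens
```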
## apply_token_indexers
```python
class BoolQDatasetReader(DatasetReader):
    ...
    def apply_token_indexers(self, instance: Instance) -> None
```
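The base `DatasetReader` API uses `apply_token_indexers` to attach the reader's token indexers to an instance lazily, after `text_to_instance` has produced it. A hedged usage sketch (the strings are placeholders):

```python
# Hedged sketch of the lazy-indexing pattern; strings are placeholders.
reader = BoolQDatasetReader()
instance = reader.text_to_instance(
    passage="Some passage text.",
    question="some yes/no question",
)
# Attach the reader's token_indexers to the instance's "tokens" TextField
# so the instance can later be indexed against a Vocabulary.
reader.apply_token_indexers(instance)
```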