text_only
allennlp.tango.text_only
AllenNLP Tango is an experimental API and parts of it might change or disappear every time we release a new version.
TextOnlyDataset
@Step.register("text_only")
class TextOnlyDataset(Step)
This step converts a dataset into another dataset that contains only the strings from the original dataset.
You can specify exactly which fields to keep from the original dataset (the default is all of them). You can also specify a minimum string length, to filter out strings that are too short.
DETERMINISTIC
class TextOnlyDataset(Step):
| ...
| DETERMINISTIC = True
run
class TextOnlyDataset(Step):
| ...
| def run(
| self,
| input: DatasetDict,
| *,
| fields_to_keep: Optional[Set[str]] = None,
| min_length: Optional[int] = None
| ) -> DatasetDict
Turns the input dataset into another dataset that contains only the strings from the original dataset.
fields_to_keep is an optional set of field names that you want to keep in the result. If this is None, all fields are kept.
min_length specifies the minimum length that a string must have to be part of the result. If this is None, all strings are considered.
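The filtering described above can be illustrated with a minimal sketch in plain Python. This is not the AllenNLP implementation: the function text_only, the use of plain dicts in place of DatasetDict, the restriction to top-level string fields, and the {"text": ...} output shape are all assumptions made for illustration.

```python
from typing import Any, Dict, List, Mapping, Optional, Set


def text_only(
    splits: Mapping[str, List[Dict[str, Any]]],
    *,
    fields_to_keep: Optional[Set[str]] = None,
    min_length: Optional[int] = None,
) -> Dict[str, List[Dict[str, str]]]:
    """Hypothetical stand-in for TextOnlyDataset.run, operating on plain dicts."""
    result: Dict[str, List[Dict[str, str]]] = {}
    for split_name, instances in splits.items():
        texts: List[Dict[str, str]] = []
        for instance in instances:
            for field_name, value in instance.items():
                # Only string values survive; labels, numbers, etc. are dropped.
                if not isinstance(value, str):
                    continue
                # None means "keep every field".
                if fields_to_keep is not None and field_name not in fields_to_keep:
                    continue
                # None means "consider every string"; otherwise drop short ones.
                if min_length is not None and len(value) < min_length:
                    continue
                # Assumed output shape: one single-field instance per string.
                texts.append({"text": value})
        result[split_name] = texts
    return result


example = {
    "train": [
        {"premise": "A dog runs in the park.", "hypothesis": "Hi", "label": 1},
    ]
}
filtered = text_only(
    example, fields_to_keep={"premise", "hypothesis"}, min_length=5
)
unfiltered = text_only(example)
```

In this sketch, filtered keeps only the premise string, because "Hi" is shorter than min_length and label is not a string, while unfiltered keeps both strings.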