data_loader
allennlp.data.data_loaders.data_loader
TensorDict¶
TensorDict = Dict[str, Union[torch.Tensor, Dict[str, torch.Tensor]]]
TensorDict is the type we use for batches.
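For illustration only, a batch for a small text-classification dataset might look like the following; the field names ("label", "tokens") and the indexer key ("token_ids") are hypothetical and depend entirely on the fields and token indexers in use.

import torch

# A hypothetical TensorDict for a batch of two instances: each value is either
# a tensor ("label") or a nested dict of tensors ("tokens"), matching the alias.
batch = {
    "label": torch.tensor([1, 0]),
    "tokens": {"token_ids": torch.tensor([[2, 5, 7, 0], [3, 8, 0, 0]])},
}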
DataLoader¶
class DataLoader(Registrable)
A DataLoader is responsible for generating batches of instances from a DatasetReader, or another source of data.
This is purely an abstract base class. All concrete subclasses must provide implementations of the following methods:
- __iter__(), which creates an iterable of TensorDicts,
- iter_instances(), which creates an iterable of Instances,
- index_with(), which should index the data with a vocabulary, and
- set_target_device(), which updates the device that batch tensors should be put on when they are generated in __iter__().
Additionally, this class should implement __len__() when possible.
The default implementation is MultiProcessDataLoader.
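For illustration, here is a minimal sketch of a concrete subclass that serves batches from an in-memory list of Instances. The registration name "simple_list", the batch_size parameter, and the naive sequential batching are assumptions made for this example, not part of the library.

from typing import Iterator, List, Optional

import torch

from allennlp.data import Batch, Instance, Vocabulary
from allennlp.data.data_loaders.data_loader import DataLoader, TensorDict
from allennlp.nn import util as nn_util


@DataLoader.register("simple_list")  # hypothetical name, for illustration only
class SimpleListDataLoader(DataLoader):
    def __init__(self, instances: List[Instance], batch_size: int) -> None:
        self._instances = instances
        self._batch_size = batch_size
        self._device: Optional[torch.device] = None

    def __len__(self) -> int:
        # Number of batches per epoch.
        return (len(self._instances) + self._batch_size - 1) // self._batch_size

    def __iter__(self) -> Iterator[TensorDict]:
        # Assumes index_with() has already been called so that instances
        # can be turned into padded index tensors.
        for start in range(0, len(self._instances), self._batch_size):
            batch = Batch(self._instances[start : start + self._batch_size])
            tensor_dict = batch.as_tensor_dict()
            if self._device is not None:
                tensor_dict = nn_util.move_to_device(tensor_dict, self._device)
            yield tensor_dict

    def iter_instances(self) -> Iterator[Instance]:
        yield from self._instances

    def index_with(self, vocab: Vocabulary) -> None:
        for instance in self._instances:
            instance.index_fields(vocab)

    def set_target_device(self, device: torch.device) -> None:
        self._device = device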
default_implementation¶
class DataLoader(Registrable):
| ...
| default_implementation = "multiprocess"
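As a sketch of what the default buys you: constructing a DataLoader from parameters with no explicit "type" key resolves to MultiProcessDataLoader. The reader object and data path below are placeholders.

from allennlp.common import Params
from allennlp.data import DataLoader, DatasetReader

my_reader: DatasetReader = ...  # placeholder for a DatasetReader you have built

loader = DataLoader.from_params(
    Params({"batch_size": 32}),        # no "type" key, so "multiprocess" is used
    reader=my_reader,
    data_path="/path/to/train.jsonl",  # placeholder path
)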
__iter__¶
class DataLoader(Registrable):
| ...
| def __iter__(self) -> Iterator[TensorDict]
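A sketch of the typical consumption pattern, assuming loader is a concrete DataLoader that has already been indexed with the model's vocabulary and model is an AllenNLP Model:

# Each yielded batch is a TensorDict whose keys line up with the model's
# forward() arguments, so it can be unpacked as keyword arguments.
for batch in loader:
    output_dict = model(**batch)
    loss = output_dict["loss"]  # AllenNLP models return a loss when labels are present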
iter_instances¶
class DataLoader(Registrable):
| ...
| def iter_instances(self) -> Iterator[Instance]
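One common use, sketched here, is streaming the un-tensorized instances to build a Vocabulary before training; loader is any concrete DataLoader:

from allennlp.data import Vocabulary

# iter_instances() yields Instance objects rather than tensor batches,
# which is what Vocabulary.from_instances expects.
vocab = Vocabulary.from_instances(loader.iter_instances())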
index_with¶
class DataLoader(Registrable):
| ...
| def index_with(self, vocab: Vocabulary) -> None
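Indexing has to happen before batches are generated, because tensorization needs the vocabulary's token-to-index mappings. A minimal sketch, assuming vocab was built as above or taken from an existing model:

loader.index_with(vocab)  # e.g. the vocabulary built from iter_instances()
# In a typical training setup you would pass the model's vocabulary instead:
# loader.index_with(model.vocab)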
set_target_device¶
class DataLoader(Registrable):
| ...
| def set_target_device(self, device: torch.device) -> None
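A sketch of pointing the loader at a GPU so that the batches yielded by __iter__() are already on that device; assumes loader is a concrete DataLoader:

import torch

if torch.cuda.is_available():
    loader.set_target_device(torch.device("cuda:0"))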