allennlp.data.data_loaders.multitask_epoch_sampler
MultiTaskEpochSampler¶
class MultiTaskEpochSampler(Registrable)
A class that determines with what proportion each dataset should be sampled for a given epoch. This is used by the MultiTaskDataLoader. The main output of this class is the task proportion dictionary returned by get_task_proportions, which specifies what percentage of the instances for the current epoch should come from each dataset. To control this behavior as training progresses, there is an update_from_epoch_metrics method, which should be called from a Callback during training.
get_task_proportions¶
class MultiTaskEpochSampler(Registrable):
| ...
| def get_task_proportions(
| self,
| data_loaders: Mapping[str, DataLoader]
| ) -> Dict[str, float]
Given a dictionary of DataLoaders
for each dataset, returns what percentage of the
instances for the current epoch of training should come from each dataset. The input
dictionary could be used to determine how many datasets there are (e.g., for uniform
sampling) or how big each dataset is (e.g., for sampling based on size), or it could be
ignored entirely.
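The contract can be sketched without AllenNLP: a sampler returns a dict mapping task names to fractions that sum to 1.0, and it is free to ignore the data loaders entirely. The class and names below are illustrative stand-ins, not part of the library.

```python
from typing import Dict, Mapping, Sized


class FixedProportionSampler:
    """Illustrative sampler (not an allennlp class): ignores the data
    loaders and always returns the proportions it was constructed with."""

    def __init__(self, proportions: Dict[str, float]) -> None:
        total = sum(proportions.values())
        # Normalize so the returned fractions always sum to 1.0.
        self._proportions = {task: p / total for task, p in proportions.items()}

    def get_task_proportions(self, data_loaders: Mapping[str, Sized]) -> Dict[str, float]:
        return dict(self._proportions)


sampler = FixedProportionSampler({"ner": 3.0, "pos": 1.0})
# Plain lists stand in for DataLoaders here.
props = sampler.get_task_proportions({"ner": [0] * 10, "pos": [0] * 10})
```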
update_from_epoch_metrics¶
class MultiTaskEpochSampler(Registrable):
| ...
| def update_from_epoch_metrics(
| self,
| epoch_metrics: Dict[str, Any]
| ) -> None
Some implementations of EpochSamplers change their behavior based on current epoch metrics. This method is meant to be called from a Callback, to let the sampler update its sampling proportions. If your sampling technique does not depend on epoch metrics, you do not need to implement this method.
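As one hypothetical example of a metric-aware sampler (a sketch, not an AllenNLP class, with an assumed `{task}_loss` metric naming scheme): give more weight to tasks whose last epoch loss was higher, so struggling tasks see more instances in the next epoch.

```python
from typing import Any, Dict, List, Mapping, Sized


class LossProportionalSampler:
    """Hypothetical sketch: sample each task in proportion to its last
    recorded loss, so harder tasks get more data next epoch."""

    def __init__(self, task_names: List[str]) -> None:
        # Start uniform until metrics have been observed.
        self._weights = {task: 1.0 for task in task_names}

    def update_from_epoch_metrics(self, epoch_metrics: Dict[str, Any]) -> None:
        # Assumes metric keys like {"ner_loss": 0.9, "pos_loss": 0.1}.
        for task in self._weights:
            key = f"{task}_loss"
            if key in epoch_metrics:
                self._weights[task] = float(epoch_metrics[key])

    def get_task_proportions(self, data_loaders: Mapping[str, Sized]) -> Dict[str, float]:
        total = sum(self._weights.values())
        return {task: w / total for task, w in self._weights.items()}


sampler = LossProportionalSampler(["ner", "pos"])
sampler.update_from_epoch_metrics({"ner_loss": 0.9, "pos_loss": 0.1})
props = sampler.get_task_proportions({"ner": [], "pos": []})
```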
UniformSampler¶
@MultiTaskEpochSampler.register("uniform")
class UniformSampler(MultiTaskEpochSampler)
Returns a uniform distribution over datasets at every epoch.
Registered as a MultiTaskEpochSampler
with name "uniform".
get_task_proportions¶
class UniformSampler(MultiTaskEpochSampler):
| ...
| def get_task_proportions(
| self,
| data_loaders: Mapping[str, DataLoader]
| ) -> Dict[str, float]
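The uniform proportions reduce to 1 / number-of-tasks, regardless of dataset sizes. A standalone sketch of that computation (plain lists stand in for DataLoaders):

```python
from typing import Dict, Mapping, Sized


def uniform_proportions(data_loaders: Mapping[str, Sized]) -> Dict[str, float]:
    # Every task gets the same share, no matter how big its dataset is.
    return {task: 1.0 / len(data_loaders) for task in data_loaders}


props = uniform_proportions({"ner": [0] * 100, "pos": [0] * 10, "srl": [0] * 1})
```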
WeightedSampler¶
@MultiTaskEpochSampler.register("weighted")
class WeightedSampler(MultiTaskEpochSampler):
| def __init__(self, weights: Dict[str, float])
Returns a weighted distribution over datasets at every epoch, where each task's weight is given by the weights constructor argument.
Registered as a MultiTaskEpochSampler
with name "weighted".
get_task_proportions¶
class WeightedSampler(MultiTaskEpochSampler):
| ...
| def get_task_proportions(
| self,
| data_loaders: Mapping[str, DataLoader]
| ) -> Dict[str, float]
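A sketch of the weighted behavior, assuming the proportions are simply the configured weights normalized to sum to 1.0 (a reasonable reading of the class, not a copy of the library code; the data loaders themselves are not consulted):

```python
from typing import Dict


def weighted_proportions(weights: Dict[str, float]) -> Dict[str, float]:
    # Normalize the configured per-task weights into fractions summing to 1.0.
    total = sum(weights.values())
    return {task: w / total for task, w in weights.items()}


props = weighted_proportions({"ner": 2.0, "pos": 1.0, "srl": 1.0})
```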
ProportionalSampler¶
@MultiTaskEpochSampler.register("proportional")
class ProportionalSampler(MultiTaskEpochSampler)
Samples from every dataset in proportion to its size. This has essentially the same effect as using all of the data at every epoch, but it lets you control the number of instances per epoch, if you want to do that. This requires that all data loaders implement __len__ (which rules out lazy loading). If you need this functionality with lazy loading, implement your own sampler that takes dataset sizes as a constructor parameter.
Registered as a MultiTaskEpochSampler
with name "proportional".
get_task_proportions¶
class ProportionalSampler(MultiTaskEpochSampler):
| ...
| def get_task_proportions(
| self,
| data_loaders: Mapping[str, DataLoader]
| ) -> Dict[str, float]
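Size-proportional sampling can be sketched with plain sized containers in place of DataLoaders; per the caveat above, each loader must support len():

```python
from typing import Dict, Mapping, Sized


def proportional_proportions(data_loaders: Mapping[str, Sized]) -> Dict[str, float]:
    # Each task's share equals its dataset's fraction of all instances.
    sizes = {task: len(loader) for task, loader in data_loaders.items()}
    total = sum(sizes.values())
    return {task: size / total for task, size in sizes.items()}


props = proportional_proportions(
    {"ner": [0] * 600, "pos": [0] * 300, "srl": [0] * 100}
)
```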