samplers

[ allennlp.data.samplers.samplers ]


Sampler Objects

class Sampler(Registrable)

A copy of PyTorch's Sampler that can be registered with Registrable.

BatchSampler Objects

class BatchSampler(Registrable)

A copy of PyTorch's BatchSampler that can be registered with Registrable.

SequentialSampler Objects

class SequentialSampler(data.SequentialSampler,  Sampler):
 | def __init__(self, data_source: data.Dataset)

A registrable version of PyTorch's SequentialSampler. Samples elements sequentially, always in the same order.

Registered as a Sampler with name "sequential".

In a typical AllenNLP configuration file, the data_source parameter does not get an entry under "sampler"; it is constructed separately.
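
To make the note about data_source concrete, here is a minimal sketch of a configuration fragment, written as a Python dict that mirrors the JSON layout. Only the registered name "sequential" comes from this page; the enclosing "data_loader" key and the "batch_size" entry are assumptions about a typical configuration and may differ in your version of AllenNLP.

```python
import json

# Hypothetical configuration fragment (a Python dict mirroring the JSON layout).
# Assumptions: the "data_loader" and "batch_size" keys; only the registered
# sampler name "sequential" is taken from the documentation above.
data_loader_config = {
    "data_loader": {
        "batch_size": 32,
        # Note that there is no "data_source" entry under "sampler":
        # the dataset is constructed separately and passed in by the framework.
        "sampler": {"type": "sequential"},
    }
}

print(json.dumps(data_loader_config, indent=2))
```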

RandomSampler Objects

class RandomSampler(data.RandomSampler,  Sampler):
 | def __init__(
 |     self,
 |     data_source: data.Dataset,
 |     replacement: bool = False,
 |     num_samples: int = None
 | )

A registrable version of PyTorch's RandomSampler. Samples elements randomly. Without replacement, it samples from a shuffled dataset, so each element is drawn exactly once. With replacement, the user can specify the number of samples to draw with num_samples.

Registered as a Sampler with name "random".

Parameters

  • data_source : Dataset
    The dataset to sample from.

    In a typical AllenNLP configuration file, this parameter does not get an entry under "sampler"; it is constructed separately.
  • replacement : bool, optional (default = False)
    If True, samples are drawn with replacement.
  • num_samples : int, optional (default = len(dataset))
    The number of samples to draw. Specify this argument only when replacement is True.
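
The difference between the two modes described above can be illustrated with a small pure-Python sketch. The helper random_indices and its seed argument are hypothetical and exist only for this illustration; this is not the library's implementation, only a model of the documented behavior.

```python
import random
from typing import List, Optional, Sequence


def random_indices(
    data_source: Sequence,
    replacement: bool = False,
    num_samples: Optional[int] = None,
    seed: Optional[int] = None,
) -> List[int]:
    """Hypothetical helper mimicking the documented behavior (not the library code)."""
    rng = random.Random(seed)
    n = len(data_source)
    if not replacement:
        if num_samples is not None:
            raise ValueError("num_samples should only be given when replacement=True")
        # Without replacement: every index appears exactly once, in shuffled order.
        order = list(range(n))
        rng.shuffle(order)
        return order
    # With replacement: draw num_samples indices (default len(data_source));
    # the same index may appear more than once.
    count = n if num_samples is None else num_samples
    return [rng.randrange(n) for _ in range(count)]


dataset = ["a", "b", "c", "d", "e"]
print(random_indices(dataset))                                    # a permutation of 0..4
print(random_indices(dataset, replacement=True, num_samples=8))   # 8 draws, repeats possible
```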

SubsetRandomSampler Objects

class SubsetRandomSampler(data.SubsetRandomSampler,  Sampler):
 | def __init__(self, indices: List[int])

A registrable version of PyTorch's SubsetRandomSampler. Samples elements randomly from a given list of indices, without replacement.

Registered as a Sampler with name "subset_random".

Parameters

  • indices : List[int]
    A sequence of indices to sample from.
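
A common use of this kind of sampler is to restrict sampling to one part of a dataset, for example a training split. The sketch below models that behavior in pure Python; subset_random_order and the 8/2 split are hypothetical and only illustrate the idea, they are not the library's code.

```python
import random
from typing import List, Optional


def subset_random_order(indices: List[int], seed: Optional[int] = None) -> List[int]:
    """Hypothetical helper: the given indices in random order, each exactly once."""
    rng = random.Random(seed)
    # random.sample with k=len(indices) returns a shuffled copy and leaves
    # the input list unchanged.
    return rng.sample(indices, k=len(indices))


all_indices = list(range(10))
train_indices, validation_indices = all_indices[:8], all_indices[8:]

# Only the training indices are ever produced; validation indices never appear.
print(subset_random_order(train_indices))
```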

WeightedRandomSampler Objects

class WeightedRandomSampler(data.WeightedRandomSampler,  Sampler):
 | def __init__(
 |     self,
 |     weights: List[float],
 |     num_samples: int,
 |     replacement: bool = True
 | )

A registrable version of PyTorch's WeightedRandomSampler. Samples elements from [0,...,len(weights)-1] with the given probabilities (weights).

Registered as a Sampler with name "weighted_random".

Parameters

  • weights : List[float]
    A sequence of weights, not necessarily summing to one.
  • num_samples : int
    The number of samples to draw.
  • replacement : bool
    If True, samples are drawn with replacement. If False, they are drawn without replacement, which means that once a sample index is drawn for a row, it cannot be drawn again for that row.

Examples

Sampling is random, so the exact output varies between runs; the lists below are illustrative.

>>> list(WeightedRandomSampler([0.1, 0.9, 0.4, 0.7, 3.0, 0.6], 5, replacement=True))
[4, 4, 1, 4, 5]
>>> list(WeightedRandomSampler([0.9, 0.4, 0.05, 0.2, 0.3, 0.1], 5, replacement=False))
[0, 1, 4, 3, 2]
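
To show what unnormalized weights and the replacement flag mean in practice, here is a pure-Python sketch. The helper weighted_indices and its seed argument are hypothetical, and drawing one index at a time with the weight zeroed afterwards is just one simple way to model sampling without replacement; it is not claimed to be how the library does it.

```python
import random
from typing import List, Optional


def weighted_indices(
    weights: List[float],
    num_samples: int,
    replacement: bool = True,
    seed: Optional[int] = None,
) -> List[int]:
    """Hypothetical helper modeling the documented behavior (not the library code)."""
    rng = random.Random(seed)
    population = list(range(len(weights)))
    if replacement:
        # random.choices takes relative weights, so they need not sum to one.
        return rng.choices(population, weights=weights, k=num_samples)
    if num_samples > len(weights):
        raise ValueError("cannot draw more samples than weights without replacement")
    remaining = list(weights)
    drawn: List[int] = []
    for _ in range(num_samples):
        pick = rng.choices(population, weights=remaining, k=1)[0]
        drawn.append(pick)
        remaining[pick] = 0.0  # a drawn index can no longer be selected
    return drawn


print(weighted_indices([0.1, 0.9, 0.4, 0.7, 3.0, 0.6], 5, replacement=True))
print(weighted_indices([0.9, 0.4, 0.05, 0.2, 0.3, 0.1], 5, replacement=False))
```

With replacement, index 4 (weight 3.0 out of a total of 5.7) is expected to appear most often; without replacement, the five indices returned are always distinct.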

BasicBatchSampler Objects

class BasicBatchSampler(data.BatchSampler,  BatchSampler):
 | def __init__(
 |     self,
 |     sampler: Sampler,
 |     batch_size: int,
 |     drop_last: bool
 | )

A registrable version of PyTorch's BatchSampler. Wraps another sampler to yield mini-batches of indices.

Registered as a BatchSampler with name "basic".

Parameters

  • sampler : Sampler
    The base sampler.
  • batch_size : int
    The size of the batch.
  • drop_last : bool
    If True, the sampler will drop the last batch if its size would be less than batch_size.

Examples

>>> list(BasicBatchSampler(SequentialSampler(range(10)), batch_size=3, drop_last=False))
[[0, 1, 2], [3, 4, 5], [6, 7, 8], [9]]
>>> list(BasicBatchSampler(SequentialSampler(range(10)), batch_size=3, drop_last=True))
[[0, 1, 2], [3, 4, 5], [6, 7, 8]]
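
The batching behavior in the examples above, including the effect of drop_last, can be modeled with a short generator. The function batch_indices is a hypothetical stand-in written for illustration; it accepts any iterable of indices in place of a sampler and is not the library's implementation.

```python
from typing import Iterable, Iterator, List


def batch_indices(
    sampler: Iterable[int], batch_size: int, drop_last: bool
) -> Iterator[List[int]]:
    """Hypothetical helper: group indices from `sampler` into lists of batch_size."""
    batch: List[int] = []
    for index in sampler:
        batch.append(index)
        if len(batch) == batch_size:
            yield batch
            batch = []
    # A final, smaller batch is emitted only when drop_last is False.
    if batch and not drop_last:
        yield batch


print(list(batch_indices(range(10), batch_size=3, drop_last=False)))
# [[0, 1, 2], [3, 4, 5], [6, 7, 8], [9]]
print(list(batch_indices(range(10), batch_size=3, drop_last=True)))
# [[0, 1, 2], [3, 4, 5], [6, 7, 8]]
```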