OnePath = Union[str, PathLike]


ManyPaths = Sequence[OnePath]


ImagesWithSize = Tuple[FloatTensor, IntTensor]


class ImageLoader(Registrable):
 | def __init__(
 |     self,
 |     *,
 |     size_divisibility: int = 0,
 |     pad_value: float = 0.0,
 |     device: Union[str, torch.device] = "cpu"
 | ) -> None

An ImageLoader is a callable that takes one or more filenames as input and returns two tensors: one holding the images themselves and one holding the size of each image.

The first tensor holds the images and has shape (batch_size, color_channels, height, width). The second tensor holds the sizes and has shape (batch_size, 2), where the last dimension contains the height and width, in that order.

If only a single image is passed (as a Path or str rather than a list), the batch dimension is removed.

Subclasses only need to implement the load() method, which should load a single image from a path.


  • size_divisibility : int, optional (default = 0)
    If set to a positive number, padding will be added so that the height and width dimensions are divisible by size_divisibility. Certain models may require this.

  • pad_value : float, optional (default = 0.0)
    The value to use for padding.

  • device : Union[str, torch.device], optional (default = "cpu")
    A torch device identifier to put the image and size tensors on.
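
The padding arithmetic implied by size_divisibility can be sketched in plain Python. This is a minimal illustration, not the library's implementation: the helper names padded_size and batch_padded_size are hypothetical, and the sketch assumes that a batch is padded to the largest height and width among its images before rounding up.

```python
from typing import Sequence, Tuple


def padded_size(height: int, width: int, size_divisibility: int = 0) -> Tuple[int, int]:
    """Round height and width up to the next multiple of size_divisibility.

    Hypothetical helper for illustration; values <= 1 leave the size unchanged.
    """
    if size_divisibility <= 1:
        return height, width
    d = size_divisibility
    # Integer ceiling division, then scale back up to a multiple of d.
    return -(-height // d) * d, -(-width // d) * d


def batch_padded_size(
    sizes: Sequence[Tuple[int, int]], size_divisibility: int = 0
) -> Tuple[int, int]:
    """Common (height, width) for a batch of images.

    Assumption: every image is padded to the largest height and width in the batch.
    """
    max_height = max(h for h, _ in sizes)
    max_width = max(w for _, w in sizes)
    return padded_size(max_height, max_width, size_divisibility)
```

Under these assumptions, a 500 x 333 image with size_divisibility = 32 would be padded to 512 x 352, with the added pixels filled with pad_value.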


class ImageLoader(Registrable):
 | ...
 | default_implementation = "torch"


class ImageLoader(Registrable):
 | ...
 | def __call__(
 |     self,
 |     filename_or_filenames: Union[OnePath, ManyPaths]
 | ) -> ImagesWithSize
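
The single-path versus many-paths contract of __call__ can be illustrated with a toy stand-in. This is a sketch only: ToyLoader is a hypothetical class that does not inherit from ImageLoader, it uses nested lists in place of tensors so that it runs without torch, and its load() returns a fixed dummy image instead of reading a file.

```python
from os import PathLike
from pathlib import Path
from typing import List, Sequence, Union

OnePath = Union[str, PathLike]
ManyPaths = Sequence[OnePath]

# Nested lists standing in for a tensor of shape (color_channels, height, width).
Image = List[List[List[float]]]


class ToyLoader:
    """Hypothetical stand-in for an ImageLoader subclass; reads nothing from disk."""

    def load(self, filename: OnePath) -> Image:
        # A real subclass would decode the file; this returns a 3 x 2 x 4 image of zeros.
        return [[[0.0] * 4 for _ in range(2)] for _ in range(3)]

    def __call__(self, filename_or_filenames: Union[OnePath, ManyPaths]):
        if isinstance(filename_or_filenames, (str, PathLike)):
            # Single path: run it as a batch of one, then drop the batch dimension.
            images, sizes = self([filename_or_filenames])
            return images[0], sizes[0]
        images = [self.load(f) for f in filename_or_filenames]
        # Each size is (height, width), taken from the image's last two dimensions.
        sizes = [(len(img[0]), len(img[0][0])) for img in images]
        return images, sizes


loader = ToyLoader()
image, size = loader("cat.jpg")  # single path: no batch dimension
images, sizes = loader(["cat.jpg", Path("dog.jpg")])  # list of paths: batch of two
```

Only load() is specific to the toy class here, which mirrors the note above that subclasses only need to implement load().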


class ImageLoader(Registrable):
 | ...
 | def load(self, filename: OnePath) -> FloatTensor


class TorchImageLoader(ImageLoader):
 | def __init__(
 |     self,
 |     *,
 |     image_backend: str = None,
 |     resize: bool = True,
 |     normalize: bool = True,
 |     min_size: int = 800,
 |     max_size: int = 1333,
 |     pixel_mean: Tuple[float, float, float] = (0.485, 0.456, 0.406),
 |     pixel_std: Tuple[float, float, float] = (0.229, 0.224, 0.225),
 |     size_divisibility: int = 32,
 |     **kwargs
 | ) -> None

This is just a wrapper around the default image loader from torchvision.


  • image_backend : Optional[str], optional (default = None)
    Set the image backend. Can be one of "PIL" or "accimage".
  • resize : bool, optional (default = True)
    If True (the default), images will be resized when necessary according to the values of min_size and max_size.
  • normalize : bool, optional (default = True)
    If True (the default), images will be normalized according to the values of pixel_mean and pixel_std.
  • min_size : int, optional (default = 800)
    If resize is True, images smaller than this will be resized up to min_size.
  • max_size : int, optional (default = 1333)
    If resize is True, images larger than this will be resized down to max_size.
  • pixel_mean : Tuple[float, float, float], optional (default = (0.485, 0.456, 0.406))
    Mean values for image normalization. The defaults are reasonable for most models from torchvision.
  • pixel_std : Tuple[float, float, float], optional (default = (0.229, 0.224, 0.225))
    Standard deviation for image normalization. The defaults are reasonable for most models from torchvision.
  • size_divisibility : int, optional (default = 32)
    Same parameter as with the ImageLoader base class, but the default here is different.
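
The resize and normalize options can be sketched with plain arithmetic. This is an illustration under stated assumptions, not the loader's actual code: resize_scale and normalize_pixel are hypothetical helpers, pixel values are assumed to lie in [0, 1], and the resize rule shown (scale the shorter side to min_size unless that would push the longer side past max_size) is one common scheme that this page does not confirm.

```python
from typing import Tuple


def resize_scale(height: int, width: int, min_size: int = 800, max_size: int = 1333) -> float:
    """Scale factor for an image under the assumed shorter-side/longer-side rule."""
    shorter, longer = min(height, width), max(height, width)
    # Bring the shorter side to min_size.
    scale = min_size / shorter
    # If the longer side would then exceed max_size, cap it at max_size instead.
    if longer * scale > max_size:
        scale = max_size / longer
    return scale


def normalize_pixel(
    rgb: Tuple[float, float, float],
    pixel_mean: Tuple[float, float, float] = (0.485, 0.456, 0.406),
    pixel_std: Tuple[float, float, float] = (0.229, 0.224, 0.225),
) -> Tuple[float, ...]:
    """Per-channel normalization: subtract the mean, then divide by the standard deviation."""
    return tuple((v - m) / s for v, m, s in zip(rgb, pixel_mean, pixel_std))
```

Under this rule, a 480 x 640 image is scaled by 800 / 480, while a 400 x 1000 image is scaled by 1333 / 1000 because the longer side would otherwise exceed max_size.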


class TorchImageLoader(ImageLoader):
 | ...
 | def load(self, filename: OnePath) -> FloatTensor