Datasets - Torchvision 0.22 documentation

Torchvision provides many built-in datasets in the torchvision.datasets module, as well as utility classes for building your own datasets.

Built-in datasets

All datasets are subclasses of torch.utils.data.Dataset, i.e., they have __getitem__ and __len__ methods implemented. Hence, they can all be passed to a torch.utils.data.DataLoader, which can load multiple samples in parallel using torch.multiprocessing workers. For example:

    import torch
    import torchvision

    imagenet_data = torchvision.datasets.ImageNet('path/to/imagenet_root/')
    data_loader = torch.utils.data.DataLoader(imagenet_data, batch_size=4, shuffle=True, num_workers=args.nThreads)

All the datasets have a similar API. They all have two common arguments: transform and target_transform, which transform the input and the target respectively. You can also create your own datasets using the provided base classes.

Warning

When a dataset object is created with download=True, the files are first downloaded and extracted in the root directory. This download logic is not multi-process safe, so it may lead to conflicts / race conditions if it is run within a distributed setting. In distributed mode, we recommend creating a dummy dataset object to trigger the download logic before setting up distributed mode.
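The recommended ordering can be sketched as follows. The helper build_after_predownload and its two callables are hypothetical names introduced for illustration. In real code, make_dataset might be something like functools.partial(torchvision.datasets.MNIST, "data"), and init_distributed a wrapper around torch.distributed.init_process_group. The sketch assumes the first step runs in a single process before any workers exist.

```python
from typing import Any, Callable


def build_after_predownload(
    make_dataset: Callable[..., Any],
    init_distributed: Callable[[], None],
) -> Any:
    """Download first, then enter distributed mode, then build the real dataset."""
    # 1. Dummy object: created only to trigger the download and extraction.
    make_dataset(download=True)
    # 2. Distributed mode is set up only after the files are on disk.
    init_distributed()
    # 3. The dataset actually used for training never runs the download logic.
    return make_dataset(download=False)


# Stand-ins that record the call order instead of touching the network.
calls = []


def fake_dataset(download: bool) -> str:
    calls.append(f"dataset(download={download})")
    return "dataset"


def fake_init() -> None:
    calls.append("init_distributed")


result = build_after_predownload(fake_dataset, fake_init)
print(calls)
```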

Image classification

Caltech101(root[, target_type, transform, ...]) Caltech 101 Dataset.
Caltech256(root[, transform, ...]) Caltech 256 Dataset.
CelebA(root[, split, target_type, ...]) Large-scale CelebFaces Attributes (CelebA) Dataset.
CIFAR10(root[, train, transform, ...]) CIFAR10 Dataset.
CIFAR100(root[, train, transform, ...]) CIFAR100 Dataset.
Country211(root, split, ...) The Country211 Data Set from OpenAI.
DTD(root, split, partition, ...) Describable Textures Dataset (DTD).
EMNIST(root, split, **kwargs) EMNIST Dataset.
EuroSAT(root, transform, ...) RGB version of the EuroSAT Dataset.
FakeData([size, image_size, num_classes, ...]) A fake dataset that returns randomly generated images as PIL images.
FashionMNIST(root[, train, transform, ...]) Fashion-MNIST Dataset.
FER2013(root[, split, transform, ...]) FER2013 Dataset.
FGVCAircraft(root, split, ...) FGVC Aircraft Dataset.
Flickr8k(root, ann_file, ...) Flickr8k Entities Dataset.
Flickr30k(root, ann_file, transform, ...) Flickr30k Entities Dataset.
Flowers102(root, split, ...) Oxford 102 Flower Dataset.
Food101(root, split, ...) The Food-101 Data Set.
GTSRB(root[, split, transform, ...]) German Traffic Sign Recognition Benchmark (GTSRB) Dataset.
INaturalist(root[, version, target_type, ...]) iNaturalist Dataset.
ImageNet(root[, split]) ImageNet 2012 Classification Dataset.
Imagenette(root, split, size) Imagenette image classification dataset.
KMNIST(root[, train, transform, ...]) Kuzushiji-MNIST Dataset.
LFWPeople(root, split, image_set, transform, ...) LFW Dataset.
LSUN(root[, classes, transform, ...]) LSUN dataset.
MNIST(root[, train, transform, ...]) MNIST Dataset.
Omniglot(root[, background, transform, ...]) Omniglot Dataset.
OxfordIIITPet(root[, split, target_types, ...]) Oxford-IIIT Pet Dataset.
Places365(root, split, ...) Places365 classification dataset.
PCAM(root[, split, transform, ...]) PCAM Dataset.
QMNIST(root[, what, compat, train]) QMNIST Dataset.
RenderedSST2(root, split, ...) The Rendered SST2 Dataset.
SEMEION(root[, transform, target_transform, ...]) SEMEION Dataset.
SBU(root, transform, ...) SBU Captioned Photo Dataset.
StanfordCars(root, split, ...) Stanford Cars Dataset.
STL10(root[, split, folds, transform, ...]) STL10 Dataset.
SUN397(root, transform, ...) The SUN397 Data Set.
SVHN(root[, split, transform, ...]) SVHN Dataset.
USPS(root[, train, transform, ...]) USPS Dataset.

Image detection or segmentation

CocoDetection(root, annFile[, transform, ...]) MS Coco Detection Dataset.
CelebA(root[, split, target_type, ...]) Large-scale CelebFaces Attributes (CelebA) Dataset.
Cityscapes(root[, split, mode, target_type, ...]) Cityscapes Dataset.
Kitti(root[, train, transform, ...]) KITTI Dataset.
OxfordIIITPet(root[, split, target_types, ...]) Oxford-IIIT Pet Dataset.
SBDataset(root[, image_set, mode, download, ...]) Semantic Boundaries Dataset.
VOCSegmentation(root[, year, image_set, ...]) Pascal VOC Segmentation Dataset.
VOCDetection(root[, year, image_set, ...]) Pascal VOC Detection Dataset.
WIDERFace(root[, split, transform, ...]) WIDERFace Dataset.

Optical Flow

FlyingChairs(root[, split, transforms]) FlyingChairs Dataset for optical flow.
FlyingThings3D(root, split, ...) FlyingThings3D dataset for optical flow.
HD1K(root, split, ...) HD1K dataset for optical flow.
KittiFlow(root, split, ...) KITTI dataset for optical flow (2015).
Sintel(root, split, ...) Sintel Dataset for optical flow.

Stereo Matching

CarlaStereo(root[, transforms]) Carla simulator data linked in the CREStereo github repo.
Kitti2012Stereo(root[, split, transforms]) KITTI dataset from the 2012 stereo evaluation benchmark.
Kitti2015Stereo(root[, split, transforms]) KITTI dataset from the 2015 stereo evaluation benchmark.
CREStereo(root[, transforms]) Synthetic dataset used in training the CREStereo architecture.
FallingThingsStereo(root[, variant, transforms]) FallingThings dataset.
SceneFlowStereo(root[, variant, pass_name, ...]) Dataset interface for Scene Flow datasets.
SintelStereo(root[, pass_name, transforms]) Sintel Stereo Dataset.
InStereo2k(root[, split, transforms]) InStereo2k dataset.
ETH3DStereo(root[, split, transforms]) ETH3D Low-Res Two-View dataset.
Middlebury2014Stereo(root[, split, ...]) Publicly available scenes from the Middlebury dataset 2014 version https://vision.middlebury.edu/stereo/data/scenes2014/.

Image pairs

Image captioning

Video classification

Video prediction

Base classes for custom datasets

DatasetFolder(root, loader[, extensions, ...]) A generic data loader.
ImageFolder(root, transform, ...) A generic data loader where the images are arranged by default in one subdirectory per class under root.
VisionDataset([root, transforms, transform, ...]) Base Class For making datasets which are compatible with torchvision.

Transforms v2