Datasets — Torchvision 0.22 documentation
Torchvision provides many built-in datasets in the torchvision.datasets
module, as well as utility classes for building your own datasets.
Built-in datasets¶
All datasets are subclasses of torch.utils.data.Dataset, i.e., they have __getitem__ and __len__ methods implemented. Hence, they can all be passed to a torch.utils.data.DataLoader which can load multiple samples in parallel using torch.multiprocessing workers. For example:

```python
import torch
import torchvision

imagenet_data = torchvision.datasets.ImageNet('path/to/imagenet_root/')
data_loader = torch.utils.data.DataLoader(imagenet_data,
                                          batch_size=4,
                                          shuffle=True,
                                          num_workers=args.nThreads)  # args.nThreads: desired number of worker processes
```
All the datasets have almost similar APIs. They all have two common arguments: transform and target_transform, to transform the input and the target respectively. You can also create your own datasets using the provided base classes.
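For example, a minimal sketch of passing both arguments (the MNIST choice and the odd/even relabeling below are purely illustrative, not part of the dataset's API):

```python
import torchvision
from torchvision import transforms

# transform is applied to the input image, target_transform to the label.
dataset = torchvision.datasets.MNIST(
    root='path/to/mnist_root/',
    train=True,
    download=True,
    transform=transforms.ToTensor(),       # PIL image -> float tensor in [0, 1]
    target_transform=lambda y: y % 2,      # hypothetical relabeling: digit -> odd/even
)

image, target = dataset[0]
```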
Warning
When a dataset object is created with download=True, the files are first downloaded and extracted in the root directory. This download logic is not multi-process safe, so it may lead to conflicts / race conditions if it is run within a distributed setting. In distributed mode, we recommend creating a dummy dataset object to trigger the download logic before setting up distributed mode.
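A minimal sketch of that recommendation (the CIFAR10 choice, the paths, and the NCCL backend are assumptions for illustration):

```python
import torch.distributed as dist
import torchvision

# Run once before spawning / initializing the distributed workers
# (e.g. in the launcher script), so only one process writes to disk.
torchvision.datasets.CIFAR10(root='path/to/data', download=True)

# Inside each worker, after the files already exist on disk:
dist.init_process_group(backend='nccl')
dataset = torchvision.datasets.CIFAR10(root='path/to/data', train=True, download=False)
```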
Image classification¶
Class | Description |
---|---|
Caltech101(root[, target_type, transform, ...]) | Caltech 101 Dataset. |
Caltech256(root[, transform, ...]) | Caltech 256 Dataset. |
CelebA(root[, split, target_type, ...]) | Large-scale CelebFaces Attributes (CelebA) Dataset. |
CIFAR10(root[, train, transform, ...]) | CIFAR10 Dataset. |
CIFAR100(root[, train, transform, ...]) | CIFAR100 Dataset. |
Country211(root, split, ...) | The Country211 Data Set from OpenAI. |
DTD(root, split, partition, ...) | Describable Textures Dataset (DTD). |
EMNIST(root, split, **kwargs) | EMNIST Dataset. |
EuroSAT(root, transform, ...) | RGB version of the EuroSAT Dataset. |
FakeData([size, image_size, num_classes, ...]) | A fake dataset that returns randomly generated images and returns them as PIL images. |
FashionMNIST(root[, train, transform, ...]) | Fashion-MNIST Dataset. |
FER2013(root[, split, transform, ...]) | FER2013 Dataset. |
FGVCAircraft(root, split, ...) | FGVC Aircraft Dataset. |
Flickr8k(root, ann_file, ...) | Flickr8k Entities Dataset. |
Flickr30k(root, ann_file, transform, ...) | Flickr30k Entities Dataset. |
Flowers102(root, split, ...) | Oxford 102 Flower Dataset. |
Food101(root, split, ...) | The Food-101 Data Set. |
GTSRB(root[, split, transform, ...]) | German Traffic Sign Recognition Benchmark (GTSRB) Dataset. |
INaturalist(root[, version, target_type, ...]) | iNaturalist Dataset. |
ImageNet(root[, split]) | ImageNet 2012 Classification Dataset. |
Imagenette(root, split, size) | Imagenette image classification dataset. |
KMNIST(root[, train, transform, ...]) | Kuzushiji-MNIST Dataset. |
LFWPeople(root, split, image_set, transform, ...) | LFW Dataset. |
LSUN(root[, classes, transform, ...]) | LSUN dataset. |
MNIST(root[, train, transform, ...]) | MNIST Dataset. |
Omniglot(root[, background, transform, ...]) | Omniglot Dataset. |
OxfordIIITPet(root[, split, target_types, ...]) | Oxford-IIIT Pet Dataset. |
Places365(root, split, ...) | Places365 classification dataset. |
PCAM(root[, split, transform, ...]) | PCAM Dataset. |
QMNIST(root[, what, compat, train]) | QMNIST Dataset. |
RenderedSST2(root, split, ...) | The Rendered SST2 Dataset. |
SEMEION(root[, transform, target_transform, ...]) | SEMEION Dataset. |
SBU(root, transform, ...) | SBU Captioned Photo Dataset. |
StanfordCars(root, split, ...) | Stanford Cars Dataset. |
STL10(root[, split, folds, transform, ...]) | STL10 Dataset. |
SUN397(root, transform, ...) | The SUN397 Data Set. |
SVHN(root[, split, transform, ...]) | SVHN Dataset. |
USPS(root[, train, transform, ...]) | USPS Dataset. |
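As an illustration of the common classification API (the CIFAR10 choice, paths, and batch size below are arbitrary), each sample is an (image, class index) pair:

```python
import torch
import torchvision
from torchvision import transforms

# ToTensor converts the PIL image to a float tensor in [0, 1].
train_set = torchvision.datasets.CIFAR10(
    root='path/to/cifar_root/', train=True, download=True,
    transform=transforms.ToTensor(),
)
loader = torch.utils.data.DataLoader(train_set, batch_size=64, shuffle=True)

images, labels = next(iter(loader))   # images: [64, 3, 32, 32], labels: [64]
```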
Image detection or segmentation¶
Class | Description |
---|---|
CocoDetection(root, annFile[, transform, ...]) | MS Coco Detection Dataset. |
CelebA(root[, split, target_type, ...]) | Large-scale CelebFaces Attributes (CelebA) Dataset. |
Cityscapes(root[, split, mode, target_type, ...]) | Cityscapes Dataset. |
Kitti(root[, train, transform, ...]) | KITTI Dataset. |
OxfordIIITPet(root[, split, target_types, ...]) | Oxford-IIIT Pet Dataset. |
SBDataset(root[, image_set, mode, download, ...]) | Semantic Boundaries Dataset |
VOCSegmentation(root[, year, image_set, ...]) | Pascal VOC Segmentation Dataset. |
VOCDetection(root[, year, image_set, ...]) | Pascal VOC Detection Dataset. |
WIDERFace(root[, split, transform, ...]) | WIDERFace Dataset. |
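For the detection datasets the target is structured rather than a single label; CocoDetection, for instance, returns a list of COCO annotation dicts per image. A minimal sketch, assuming the COCO images and annotation file are already on disk (the paths are placeholders, and pycocotools must be installed):

```python
import torchvision

dataset = torchvision.datasets.CocoDetection(
    root='path/to/coco/train2017',
    annFile='path/to/coco/annotations/instances_train2017.json',
)

image, target = dataset[0]   # PIL image and a list of annotation dicts (bbox, category_id, ...)
```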
Optical Flow¶
Class | Description |
---|---|
FlyingChairs(root[, split, transforms]) | FlyingChairs Dataset for optical flow. |
FlyingThings3D(root, split, ...) | FlyingThings3D dataset for optical flow. |
HD1K(root, split, ...) | HD1K dataset for optical flow. |
KittiFlow(root, split, ...) | KITTI dataset for optical flow (2015). |
Sintel(root, split, ...) | Sintel Dataset for optical flow. |
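These datasets return pairs of frames together with the ground-truth flow between them. A minimal sketch using Sintel (the root path is a placeholder, and the archives are assumed to be already downloaded and extracted under it; for the test split the flow is None):

```python
import torchvision

# Each sample pairs two consecutive frames with the ground-truth flow between them.
dataset = torchvision.datasets.Sintel(root='path/to/sintel_root/', split='train')

img1, img2, flow = dataset[0]   # flow: numpy array of shape (2, H, W)
```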
Stereo Matching¶
Class | Description |
---|---|
CarlaStereo(root[, transforms]) | Carla simulator data linked in the CREStereo github repo. |
Kitti2012Stereo(root[, split, transforms]) | KITTI dataset from the 2012 stereo evaluation benchmark. |
Kitti2015Stereo(root[, split, transforms]) | KITTI dataset from the 2015 stereo evaluation benchmark. |
CREStereo(root[, transforms]) | Synthetic dataset used in training the CREStereo architecture. |
FallingThingsStereo(root[, variant, transforms]) | FallingThings dataset. |
SceneFlowStereo(root[, variant, pass_name, ...]) | Dataset interface for Scene Flow datasets. |
SintelStereo(root[, pass_name, transforms]) | Sintel Stereo Dataset. |
InStereo2k(root[, split, transforms]) | InStereo2k dataset. |
ETH3DStereo(root[, split, transforms]) | ETH3D Low-Res Two-View dataset. |
Middlebury2014Stereo(root[, split, ...]) | Publicly available scenes from the Middlebury dataset 2014 version https://vision.middlebury.edu/stereo/data/scenes2014/. |
Image pairs¶
Image captioning¶
Video classification¶
Video prediction¶
Base classes for custom datasets¶
Class | Description |
---|---|
DatasetFolder(root, loader[, extensions, ...]) | A generic data loader. |
ImageFolder(root[, transform, ...]) | A generic data loader where the images are arranged in one sub-directory per class by default. |
VisionDataset([root, transforms, transform, ...]) | Base class for making datasets which are compatible with torchvision. |
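As a sketch of the ImageFolder convention (the directory and class names below are placeholders): images are placed in one sub-directory per class, and the sub-directory names become the class labels.

```python
import torchvision

# Expected layout (placeholder paths):
# path/to/dataset_root/cat/001.png
# path/to/dataset_root/cat/002.png
# path/to/dataset_root/dog/001.png
dataset = torchvision.datasets.ImageFolder(root='path/to/dataset_root/')

print(dataset.classes)          # ['cat', 'dog'], inferred from the folder names
image, class_index = dataset[0]  # PIL image and the integer index of its class
```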