torchaudio.datasets — Torchaudio 2.7.0 documentation

All datasets are subclasses of torch.utils.data.Dataset and implement the __getitem__ and __len__ methods.

Hence, they can all be passed to a torch.utils.data.DataLoader, which can load multiple samples in parallel using torch.multiprocessing workers. For example:

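    import torch
    import torchaudio

    # args.nThreads is assumed to come from the caller's argument parser.
    yesno_data = torchaudio.datasets.YESNO('.', download=True)
    data_loader = torch.utils.data.DataLoader(
        yesno_data,
        batch_size=1,
        shuffle=True,
        num_workers=args.nThreads)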
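
The same contract can be illustrated without downloading anything. The sketch below is a minimal illustration, not part of torchaudio: ToneDataset and its parameters are hypothetical names invented here, and the (waveform, sample_rate, label) item layout is an assumption chosen only to resemble an audio sample. It shows that any object implementing __getitem__ and __len__ on top of torch.utils.data.Dataset can be batched by a DataLoader.

```python
import math

import torch
from torch.utils.data import DataLoader, Dataset


class ToneDataset(Dataset):
    """Hypothetical stand-in for an audio dataset: synthetic sine tones."""

    def __init__(self, num_items=8, sample_rate=8000, num_samples=4000):
        self.num_items = num_items
        self.sample_rate = sample_rate
        self.num_samples = num_samples

    def __len__(self):
        # DataLoader uses this to know how many indices to sample.
        return self.num_items

    def __getitem__(self, index):
        if not 0 <= index < self.num_items:
            raise IndexError(index)
        freq = 220.0 * (index + 1)
        t = torch.arange(self.num_samples) / self.sample_rate
        # Shape (channel, time), with a single channel.
        waveform = torch.sin(2 * math.pi * freq * t).unsqueeze(0)
        return waveform, self.sample_rate, index


dataset = ToneDataset()
# num_workers=0 loads in the main process; a positive value would use
# that many worker processes instead.
loader = DataLoader(dataset, batch_size=4, shuffle=False, num_workers=0)
batches = list(loader)
first_waveforms, first_rates, first_labels = batches[0]
```

Every item here has the same length, so the default collation can stack the waveforms into one tensor per batch. Real recordings usually differ in length, in which case a batch_size above 1 would likely need a custom collate_fn that pads or truncates.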

CMUARCTIC	CMU ARCTIC [Kominek et al., 2003] dataset.
CMUDict	CMU Pronouncing Dictionary [Weide, 1998] (CMUDict) dataset.
COMMONVOICE	CommonVoice [Ardila et al., 2020] dataset.
DR_VCTK	Device Recorded VCTK (Small subset version) [Sarfjoo and Yamagishi, 2018] dataset.
FluentSpeechCommands	Fluent Speech Commands [Lugosch et al., 2019] dataset.
GTZAN	GTZAN [Tzanetakis et al., 2001] dataset.
IEMOCAP	IEMOCAP [Busso et al., 2008] dataset.
LibriMix	LibriMix [Cosentino et al., 2020] dataset.
LIBRISPEECH	LibriSpeech [Panayotov et al., 2015] dataset.
LibriLightLimited	Subset of Libri-light [Kahn et al., 2020] dataset, which was used in HuBERT [Hsu et al., 2021] for supervised fine-tuning.
LIBRITTS	LibriTTS [Zen et al., 2019] dataset.
LJSPEECH	LJSpeech-1.1 [Ito and Johnson, 2017] dataset.
MUSDB_HQ	MUSDB_HQ [Rafii et al., 2019] dataset.
QUESST14	QUESST14 [Miro et al., 2015] dataset.
Snips	Snips [Coucke et al., 2018] dataset.
SPEECHCOMMANDS	Speech Commands [Warden, 2018] dataset.
TEDLIUM	Tedlium [Rousseau et al., 2012] dataset (releases 1, 2, and 3).
VCTK_092	VCTK 0.92 [Yamagishi et al., 2019] dataset.
VoxCeleb1Identification	VoxCeleb1 [Nagrani et al., 2017] dataset for speaker identification task.
VoxCeleb1Verification	VoxCeleb1 [Nagrani et al., 2017] dataset for speaker verification task.
YESNO	YesNo [YesNo, n.d.] dataset.