LIBRISPEECH — Torchaudio 2.7.0 documentation

class torchaudio.datasets.LIBRISPEECH(root: Union[str, Path], url: str = 'train-clean-100', folder_in_archive: str = 'LibriSpeech', download: bool = False)[source]

LibriSpeech [Panayotov et al., 2015] dataset.

Parameters:

root (str or Path) – Path to the directory where the dataset is found or downloaded.
url (str, optional) – The URL to download the dataset from, or the type of the dataset to download. Allowed type values are "dev-clean", "dev-other", "test-clean", "test-other", "train-clean-100", "train-clean-360" and "train-other-500". (default: "train-clean-100")
folder_in_archive (str, optional) – The top-level directory of the dataset. (default: "LibriSpeech")
download (bool, optional) – Whether to download the dataset if it is not found at the root path. (default: False)

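A minimal usage sketch is shown below; the local directory "./data" and download=True are assumptions for illustration, not defaults of the API.

```python
from torchaudio.datasets import LIBRISPEECH

# Download (if needed) and index the "train-clean-100" subset.
# "./data" is a hypothetical local directory, not part of the API.
dataset = LIBRISPEECH(root="./data", url="train-clean-100", download=True)
print(len(dataset))  # number of utterances in the subset
```
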
__getitem__

LIBRISPEECH.__getitem__(n: int) → Tuple[Tensor, int, str, int, int, int][source]

Load the n-th sample from the dataset.

Parameters:

n (int) – The index of the sample to be loaded

Returns:

Tuple of the following items:

Tensor: Waveform
int: Sample rate
str: Transcript
int: Speaker ID
int: Chapter ID
int: Utterance ID

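A short indexing sketch, assuming the dataset instance created in the example above:

```python
# __getitem__ returns a 6-tuple; unpack it by position.
waveform, sample_rate, transcript, speaker_id, chapter_id, utterance_id = dataset[0]

print(waveform.shape)   # e.g. torch.Size([1, num_frames]) -- mono audio
print(sample_rate)      # LibriSpeech audio is sampled at 16 kHz
print(transcript)       # upper-case transcript string
print(speaker_id, chapter_id, utterance_id)
```
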
get_metadata

LIBRISPEECH.get_metadata(n: int) → Tuple[str, int, str, int, int, int][source]

Get metadata for the n-th sample from the dataset. Returns filepath instead of waveform, but otherwise returns the same fields as __getitem__().

Parameters:

n (int) – The index of the sample to be loaded

Returns:

Tuple of the following items:

str: Path to audio
int: Sample rate
str: Transcript
int: Speaker ID
int: Chapter ID
int: Utterance ID

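A sketch of collecting metadata without decoding audio, again assuming the dataset instance from the first example; because no waveforms are loaded, iterating this way is cheap.

```python
# Collect (path, transcript, speaker) triples without loading any waveforms.
manifest = []
for n in range(len(dataset)):
    filepath, sample_rate, transcript, speaker_id, chapter_id, utterance_id = dataset.get_metadata(n)
    manifest.append((filepath, transcript, speaker_id))

print(manifest[0])  # path to the utterance's .flac audio, its transcript, and speaker ID
```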