speech_dataset_utils — Model Optimizer 0.31.0

Utility functions for getting samples and forward loop functions for different speech datasets.

Functions

get_speech_dataset_dataloader	Get a dataloader with the dataset name and processor of the target model.
get_supported_speech_datasets	Retrieves a list of the supported speech datasets.

get_speech_dataset_dataloader(dataset_name='peoples_speech', processor=None, batch_size=1, num_samples=512, device=None, dtype=None)

Get a dataloader with the dataset name and processor of the target model.

Parameters:

dataset_name (str) – Name of the dataset to load.
processor (WhisperProcessor) – Processor used for encoding audio and text data.
batch_size (int) – Batch size of the returned dataloader.
num_samples (int) – Number of samples to draw from the dataset.
device (str | None) – Target device for the returned dataloader.
dtype (dtype | None) – dtype of the returned dataset.

Returns:

An instance of dataloader.

Return type:

DataLoader
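
The parameters above could be combined as in the following minimal sketch for a Whisper model. The helper name build_calib_loader and the checkpoint openai/whisper-tiny are illustrative assumptions, and the import path for get_speech_dataset_dataloader is assumed to mirror the one this page uses for get_supported_speech_datasets; it has not been verified against the package's export list. Imports are deferred into the function body so the snippet can be loaded without the optional dependencies installed.

```python
def build_calib_loader(model_id="openai/whisper-tiny", num_samples=32, batch_size=4, device=None):
    """Hypothetical helper: build a speech dataloader for a Whisper checkpoint."""
    # Deferred imports: both packages are heavy and optional.
    from transformers import WhisperProcessor
    # Assumed import path, following the usage example on this page.
    from modelopt.torch.utils import get_speech_dataset_dataloader

    # The processor bundles the audio feature extractor and the tokenizer.
    processor = WhisperProcessor.from_pretrained(model_id)

    return get_speech_dataset_dataloader(
        dataset_name="peoples_speech",  # documented default
        processor=processor,
        batch_size=batch_size,
        num_samples=num_samples,
        device=device,
    )
```

Calling build_calib_loader() downloads the checkpoint and the dataset, so a small num_samples keeps a first run short. The structure of each batch is not documented on this page and should be inspected before it is passed to a model.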

get_supported_speech_datasets()

Retrieves a list of the supported speech datasets.

Returns:

A list of strings, where each string is the name of a supported dataset.

Return type:

list[str]

Example usage:

from modelopt.torch.utils import get_supported_speech_datasets

print("Supported datasets:", get_supported_speech_datasets())
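
The returned list can also be used to validate a dataset name before requesting a dataloader. The sketch below is one way to do that; check_dataset_name is a hypothetical helper, not part of Model Optimizer, and it takes the supported names as an argument so that it runs without the library installed.

```python
def check_dataset_name(name, supported):
    """Hypothetical helper: return `name` if it is in `supported`, else raise ValueError."""
    if name not in supported:
        options = ", ".join(sorted(supported))
        raise ValueError(f"Unsupported speech dataset {name!r}; choose one of: {options}")
    return name
```

With Model Optimizer installed, the second argument would be the result of get_supported_speech_datasets(), and the validated name could then be passed as dataset_name to get_speech_dataset_dataloader.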