Profiling data loaders - Amazon SageMaker AI

In PyTorch, data loader iterators, such as _SingleProcessDataLoaderIter and _MultiProcessingDataLoaderIter, are created at the beginning of every iteration over a dataset. During this initialization phase, PyTorch starts worker processes according to the configured number of workers, sets up the data queue used to fetch data, and starts the pin_memory threads.
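To illustrate why this initialization cost recurs, here is a minimal sketch using only the Python standard library. ToyLoader is a hypothetical stand-in, not PyTorch code. It only counts how often a for-loop requests a new iterator, which is the point where a real data loader would set up its workers and queues.

```python
class ToyLoader:
    """Hypothetical stand-in for a data loader; not PyTorch's implementation."""

    def __init__(self, data, batch_size):
        self.data = list(data)
        self.batch_size = batch_size
        self.iterators_created = 0

    def __iter__(self):
        # A for-loop calls iter() once per pass over the loader, so any
        # setup done here is repeated at the start of every epoch.
        self.iterators_created += 1
        return self._batches()

    def _batches(self):
        # Yield consecutive slices of the data as batches.
        for start in range(0, len(self.data), self.batch_size):
            yield self.data[start:start + self.batch_size]


loader = ToyLoader(range(6), batch_size=2)
batches_per_epoch = []
for epoch in range(3):
    # Each pass over the loader requests a fresh iterator.
    batches_per_epoch.append(sum(1 for _ in loader))

print(loader.iterators_created)
print(batches_per_epoch)
```

Three epochs produce three iterators in this sketch. Profiling the data loader helps you see how much time that repeated setup takes in a real training job.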

To use the PyTorch data loader profiling analysis tool, import the following PT_dataloader_analysis class:

from smdebug.profiler.analysis.utils.pytorch_dataloader_analysis import PT_dataloader_analysis

Pass the profiling data that you retrieved as a pandas frame data object (pf) in the section "Access the profiling data using the pandas data parsing tool":

pt_analysis = PT_dataloader_analysis(pf)

The following functions are available for the pt_analysis object:

The SMDebug S3SystemMetricsReader class reads the system metrics from the S3 bucket specified by the s3_trial_path parameter.