Default system monitoring and customized framework profiling with different profiling options

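This section describes the supported profiling configuration classes and gives an example configuration. Use the following profiling configuration classes to manage the framework profiling options: DetailedProfilingConfig, DataloaderProfilingConfig, and PythonProfilingConfig. The notes that follow cover the overhead of each option.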

Note

Detailed profiling might significantly increase GPU memory consumption. We do not recommend enabling detailed profiling for more than a couple of steps.

Note

Data loader profiling might lower training performance while it collects information from data loaders. We do not recommend enabling data loader profiling for more than a couple of steps.
Debugger is preconfigured to annotate data loader processes only in the AWS deep learning containers. It cannot profile data loader processes in custom or external training containers.

Note

Enabling Python profiling might increase the overall training time. cProfile profiles the most frequently called Python operators at every call, so its profiling overhead grows with the number of calls. Pyinstrument uses a sampling mechanism, so its cumulative profiling time grows with elapsed time.
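To make the cProfile behavior in the note concrete, here is a minimal sketch that uses only the Python standard library and is independent of SageMaker Debugger. The names `tiny_op`, `workload`, and `recorded_calls` are hypothetical and exist only for illustration. The sketch shows that cProfile records one event per function call, which is why its overhead scales with the call count.

```python
import cProfile
import pstats


def tiny_op(x):
    """Cheap function, so per-call profiling work is large relative to its own."""
    return x + 1


def workload(n_calls):
    """Call tiny_op n_calls times in a loop."""
    total = 0
    for _ in range(n_calls):
        total = tiny_op(total)
    return total


def recorded_calls(n_calls):
    """Run workload under cProfile and return the number of tiny_op calls recorded."""
    profiler = cProfile.Profile()
    profiler.runcall(workload, n_calls)
    stats = pstats.Stats(profiler)
    # stats.stats maps (file, line, function name) to
    # (primitive calls, total calls, total time, cumulative time, callers).
    for (_file, _line, func_name), entry in stats.stats.items():
        if func_name == "tiny_op":
            return entry[1]
    return 0


if __name__ == "__main__":
    for n in (10, 1000):
        print(n, recorded_calls(n))
```

Because every call produces a profiler record, a training step that invokes many small Python functions pays proportionally more, which is consistent with limiting Python profiling to a small number of steps.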

The following example configuration shows the full structure when each profiling option is set with explicit values.

from sagemaker.debugger import (
    ProfilerConfig,
    FrameworkProfile,
    DetailedProfilingConfig,
    DataloaderProfilingConfig,
    PythonProfilingConfig,
    PythonProfiler,
    cProfileTimer,
)

profiler_config = ProfilerConfig(
    # Collect system metrics every 500 milliseconds.
    system_monitor_interval_millis=500,
    framework_profile_params=FrameworkProfile(
        # Detailed framework profiling: one step, starting at step 5.
        detailed_profiling_config=DetailedProfilingConfig(
            start_step=5,
            num_steps=1,
        ),
        # Data loader profiling: one step, starting at step 7.
        dataloader_profiling_config=DataloaderProfilingConfig(
            start_step=7,
            num_steps=1,
        ),
        # Python profiling with cProfile: one step, starting at step 9.
        python_profiling_config=PythonProfilingConfig(
            start_step=9,
            num_steps=1,
            python_profiler=PythonProfiler.CPROFILE,
            cprofile_timer=cProfileTimer.TOTAL_TIME,
        ),
    ),
)
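The example configuration above staggers the three profiling options so that each one runs on a different step. The following sketch mirrors those step values in plain Python to show which option would be active at a given step. It is not part of the SageMaker Python SDK: `StepWindow`, `WINDOWS`, and `active_profilers` are hypothetical names, and the sketch assumes that `start_step` and `num_steps` define a half-open range of steps.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StepWindow:
    """Hypothetical stand-in for a step-based profiling range (not an SDK class)."""
    start_step: int
    num_steps: int

    def covers(self, step: int) -> bool:
        # Assumption: the window is half-open, [start_step, start_step + num_steps).
        return self.start_step <= step < self.start_step + self.num_steps


# Step values copied from the example configuration.
WINDOWS = {
    "detailed": StepWindow(start_step=5, num_steps=1),
    "dataloader": StepWindow(start_step=7, num_steps=1),
    "python": StepWindow(start_step=9, num_steps=1),
}


def active_profilers(step: int) -> list[str]:
    """Return the names of the profiling options whose window covers the step."""
    return [name for name, window in WINDOWS.items() if window.covers(step)]


if __name__ == "__main__":
    for step in range(4, 11):
        print(step, active_profilers(step))
```

Staggering the windows this way keeps the overhead of each option from compounding within a single step, in line with the notes above that recommend profiling for only a couple of steps.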

For more information about available profiling options, see DetailedProfilingConfig, DataloaderProfilingConfig, and PythonProfilingConfig in the Amazon SageMaker Python SDK.