memory_monitor — Model Optimizer 0.27.1
GPU Memory Monitoring Utilities.
This module provides utilities for monitoring GPU memory usage in real time using the NVIDIA Management Library (NVML). It includes a GPUMemoryMonitor class that tracks peak memory usage across all available GPUs and can start and stop monitoring in a separate thread.
Classes:
GPUMemoryMonitor: A class that monitors GPU memory usage and tracks peak memory consumption.
Functions:
launch_memory_monitor: Helper function to create and start a GPU memory monitor instance.
Example
monitor = launch_memory_monitor(monitor_interval=1.0)
# Run your GPU operations
monitor.stop() # Will print peak memory usage per GPU
Note
This module requires the NVIDIA Management Library (NVML) through the pynvml package. It automatically initializes NVML when creating a monitor instance and shuts it down when monitoring is stopped.
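The NVML lifecycle described in the note can be sketched roughly as follows. This is an illustration rather than the module's actual source: the helper names `sample_used_bytes` and `update_peaks` are hypothetical, and only the `pynvml` calls themselves are real API. For brevity the sketch initializes and shuts down NVML within a single call, whereas the module is documented as keeping NVML open from monitor creation until `stop()`. The `pynvml` import is deferred so the pure helper can be used on machines without NVML.

```python
def sample_used_bytes():
    """Return {gpu_index: used_bytes} for every visible NVIDIA GPU.

    Hypothetical helper; needs the pynvml package and an NVIDIA driver.
    """
    import pynvml  # deferred so the rest of the sketch loads without NVML

    pynvml.nvmlInit()
    try:
        used = {}
        for gpu_index in range(pynvml.nvmlDeviceGetCount()):
            handle = pynvml.nvmlDeviceGetHandleByIndex(gpu_index)
            used[gpu_index] = pynvml.nvmlDeviceGetMemoryInfo(handle).used
        return used
    finally:
        # Always release NVML, even if a query above raised NVMLError.
        pynvml.nvmlShutdown()


def update_peaks(peak_memory, sample):
    """Fold one sample into the running per-GPU maximum (in place)."""
    for gpu_index, used_bytes in sample.items():
        peak_memory[gpu_index] = max(peak_memory.get(gpu_index, 0), used_bytes)
    return peak_memory
```

Calling `update_peaks(peaks, sample_used_bytes())` repeatedly would keep `peaks` at the highest value seen per GPU, which is the quantity the monitor reports.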
Dependencies:
- pynvml: For accessing NVIDIA GPU metrics
- threading: For running the monitor in a background thread
- atexit: For ensuring proper cleanup when the program exits
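One plausible way to use `atexit` for the cleanup role listed above is to register the stop routine when the object is created and unregister it once it has run. The class below is a hypothetical stand-in, not the module's code, and how the real monitor wires up `atexit` is an assumption here.

```python
import atexit


class CleanupSketch:
    """Hypothetical object whose stop() runs at most once, even at exit."""

    def __init__(self):
        self.stopped = False
        # Ensure stop() runs if the program exits without an explicit call.
        atexit.register(self.stop)

    def stop(self):
        if self.stopped:
            return  # already cleaned up; make repeated calls harmless
        self.stopped = True
        # No longer needed at interpreter exit once cleanup has happened.
        atexit.unregister(self.stop)
```

Making `stop()` idempotent matters in this pattern, because the exit hook and user code may both try to call it.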
Classes
| Class | Description |
|---|---|
| GPUMemoryMonitor | GPU Memory Monitor for tracking NVIDIA GPU memory usage. |
Functions
| Function | Description |
|---|---|
| launch_memory_monitor | Launch a GPU memory monitor in a separate thread. |
class GPUMemoryMonitor
Bases: object
GPU Memory Monitor for tracking NVIDIA GPU memory usage.
This class provides functionality to monitor and track peak memory usage across all available NVIDIA GPUs on the system. It runs in a separate thread and periodically samples memory usage.
__init__(monitor_interval=10.0)
Initialize an NVIDIA GPU memory monitor.
This class monitors the memory usage of NVIDIA GPUs at specified intervals. It initializes the NVIDIA Management Library (NVML) and queries the number of available GPUs.
Parameters:
monitor_interval (float, optional) – Time interval in seconds between memory usage checks. Defaults to 10.0.
monitor_interval
Time interval between memory checks.
Type:
float
peak_memory
Dictionary mapping GPU indices to their peak memory usage.
Type:
dict
is_running
Flag indicating if the monitor is currently running.
Type:
bool
monitor_thread
Thread object for memory monitoring.
device_count
Number of NVIDIA GPUs available in the system.
Type:
int
Raises:
NVMLError – If NVIDIA Management Library initialization fails.
start()
Start the GPU memory monitoring in a separate daemon thread.
This method initializes and starts a daemon thread that continuously monitors GPU memory usage at the specified interval. The thread will run until stop() is called or the program exits.
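The daemon-thread behavior described for `start()` can be sketched as below. This is a minimal illustration under stated assumptions, not the library's implementation: `SketchMonitor` and its `sampler` argument are hypothetical, with `sampler` standing in for the NVML queries so the example runs without a GPU. The use of `threading.Event` for interruptible waiting is also a choice made for this sketch. Unlike the documented `stop()`, this version does not print a report or shut down NVML.

```python
import threading


class SketchMonitor:
    """Hypothetical stand-in for GPUMemoryMonitor; `sampler` replaces NVML."""

    def __init__(self, sampler, monitor_interval=10.0):
        self.sampler = sampler  # callable returning {gpu_index: used_bytes}
        self.monitor_interval = monitor_interval
        self.peak_memory = {}
        self.is_running = False
        self.monitor_thread = None
        self._stop_event = threading.Event()

    def _monitor_loop(self):
        while not self._stop_event.is_set():
            for gpu_index, used in self.sampler().items():
                previous = self.peak_memory.get(gpu_index, 0)
                self.peak_memory[gpu_index] = max(previous, used)
            # wait() returns early as soon as stop() sets the event.
            self._stop_event.wait(self.monitor_interval)

    def start(self):
        self._stop_event.clear()
        self.is_running = True
        # daemon=True lets the interpreter exit even if stop() is never called.
        self.monitor_thread = threading.Thread(
            target=self._monitor_loop, daemon=True
        )
        self.monitor_thread.start()

    def stop(self):
        self._stop_event.set()
        if self.monitor_thread is not None:
            self.monitor_thread.join()  # wait for the last sample to finish
        self.is_running = False
```

Because sampling is periodic, a short-lived allocation that begins and ends between two samples would not show up in `peak_memory`; a smaller `monitor_interval` narrows that gap at the cost of more frequent polling.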
stop()
Stop the GPU memory monitoring and display peak memory usage.
This method stops the monitoring thread, prints the peak memory usage for each GPU that was monitored, and properly shuts down the NVML interface. It will wait for the monitoring thread to complete before returning.
The peak memory usage is displayed in GB for each GPU index.
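The per-GPU report can be illustrated with a small formatting helper. The function name `format_peak_report`, the exact output wording, and the choice of 1024**3 bytes per GB are all assumptions made for this sketch; the documentation only states that values are shown in GB per GPU index.

```python
def format_peak_report(peak_memory, bytes_per_gb=1024**3):
    """Return one 'GPU i: x.xx GB' line per GPU, ordered by index.

    peak_memory maps GPU index -> peak used bytes (hypothetical layout).
    """
    lines = []
    for gpu_index in sorted(peak_memory):
        gigabytes = peak_memory[gpu_index] / bytes_per_gb
        lines.append(f"GPU {gpu_index}: {gigabytes:.2f} GB")
    return lines
```

Passing `bytes_per_gb=10**9` would switch the sketch to decimal gigabytes if that convention is preferred.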
launch_memory_monitor(monitor_interval=1.0)
Launch a GPU memory monitor in a separate thread.
Parameters:
monitor_interval (float) – Time interval in seconds between memory checks. Defaults to 1.0.
Returns:
The monitor instance that was launched
Return type: