CUDAGraph — PyTorch 2.7 documentation

class torch.cuda.CUDAGraph

Wrapper around a CUDA graph.

Warning

This API is in beta and may change in future releases.

capture_begin(pool=None, capture_error_mode='global')

Begin capturing CUDA work on the current stream.

Typically, you shouldn’t call capture_begin yourself. Use torch.cuda.graph or make_graphed_callables(), both of which call capture_begin internally.

Parameters

pool (optional) – Token (returned by graph_pool_handle() or other_Graph_instance.pool()) that hints this graph may share memory with the indicated pool. See Graph memory management.
capture_error_mode (str, optional) – Specifies the cudaStreamCaptureMode for the graph capture stream. Can be “global”, “thread_local”, or “relaxed”. During CUDA graph capture, some actions, such as cudaMalloc, may be unsafe. “global” raises an error for such actions in any thread, including other threads; “thread_local” raises an error only for such actions in the current thread; “relaxed” does not raise an error for them. Do NOT change this setting unless you’re familiar with cudaStreamCaptureMode.
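As a sketch of the recommended route, the snippet below captures through the torch.cuda.graph context manager rather than calling capture_begin directly, and passes capture_error_mode explicitly at its default value. The helper name capture_with_context_manager, the tensor shape, and the HAS_CUDA guard are illustrative assumptions, not part of the API; the function returns None when no CUDA device is available.

```python
try:
    import torch
    HAS_CUDA = torch.cuda.is_available()
except ImportError:  # lets the sketch load on machines without PyTorch
    HAS_CUDA = False


def capture_with_context_manager():
    """Capture `x * 2 + 1` via torch.cuda.graph and replay it once."""
    if not HAS_CUDA:
        return None  # CUDA graphs need a CUDA device

    graph = torch.cuda.CUDAGraph()
    # Static input: the graph reads from this exact memory on every replay.
    static_in = torch.zeros(4, device="cuda")

    # torch.cuda.graph calls capture_begin on entry and capture_end on exit.
    # "global" is the default mode; it is spelled out only for illustration.
    with torch.cuda.graph(graph, capture_error_mode="global"):
        static_out = static_in * 2 + 1

    static_in.fill_(3.0)      # update the input in place
    graph.replay()            # rerun the captured kernels on the new data
    torch.cuda.synchronize()
    return static_out.tolist()
```

The kernels are only recorded during capture, so static_out holds meaningful values only after the first replay.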

capture_end()

End CUDA graph capture on the current stream.

After capture_end, replay may be called on this instance.

Typically, you shouldn’t call capture_end yourself. Use torch.cuda.graph or make_graphed_callables(), both of which call capture_end internally.
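For readers who do need the low-level calls, the following is a minimal sketch of a manual capture, assuming a side stream is used because capture is not performed on the default stream. The helper name capture_manually and the HAS_CUDA guard are hypothetical; torch.cuda.graph performs equivalent steps for you and remains the safer choice.

```python
try:
    import torch
    HAS_CUDA = torch.cuda.is_available()
except ImportError:  # lets the sketch load on machines without PyTorch
    HAS_CUDA = False


def capture_manually():
    """Bracket the work to record with capture_begin / capture_end."""
    if not HAS_CUDA:
        return None  # CUDA graphs need a CUDA device

    graph = torch.cuda.CUDAGraph()
    static_in = torch.zeros(4, device="cuda")

    # Capture on a non-default stream, after pending work has finished.
    torch.cuda.synchronize()
    side_stream = torch.cuda.Stream()
    side_stream.wait_stream(torch.cuda.current_stream())
    with torch.cuda.stream(side_stream):
        graph.capture_begin()
        static_out = static_in + 10   # recorded, not executed
        graph.capture_end()
    torch.cuda.current_stream().wait_stream(side_stream)

    # Only after capture_end may replay be called.
    static_in.fill_(1.0)
    graph.replay()
    torch.cuda.synchronize()
    return static_out.tolist()
```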

debug_dump(debug_path)

Parameters

debug_path (required) – Path to dump the graph to.

Calls a debugging function to dump the graph, provided debugging has been enabled via CUDAGraph.enable_debug_mode().

enable_debug_mode()

Enable debugging mode for CUDAGraph.debug_dump.
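The two debugging methods are meant to be used together. The sketch below assumes that enable_debug_mode() has to be called before the capture so that the graph is still available to dump afterwards, and that the dump is a Graphviz DOT description; both points are assumptions not stated on this page. The helper name dump_graph_for_debugging and the file name cuda_graph.dot are hypothetical.

```python
import os
import tempfile

try:
    import torch
    HAS_CUDA = torch.cuda.is_available()
except ImportError:  # lets the sketch load on machines without PyTorch
    HAS_CUDA = False


def dump_graph_for_debugging():
    """Enable debug mode, capture a tiny graph, and dump it to a file."""
    if not HAS_CUDA:
        return None  # CUDA graphs need a CUDA device

    graph = torch.cuda.CUDAGraph()
    graph.enable_debug_mode()   # assumed to be required before capture

    x = torch.ones(4, device="cuda")
    with torch.cuda.graph(graph):
        y = x + 1               # the work that will appear in the dump

    # Hypothetical output location; any writable path should do.
    path = os.path.join(tempfile.mkdtemp(), "cuda_graph.dot")
    graph.debug_dump(path)
    return path
```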

pool()

Return an opaque token representing the id of this graph’s memory pool.

This id can optionally be passed to another graph’s capture_begin, which hints that the other graph may share the same memory pool.
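A minimal sketch of pool sharing follows, assuming two graphs that always run back to back so that the second can consume the first one's output. The helper name share_pool_between_graphs and the HAS_CUDA guard are illustrative; the token is passed through the pool argument of torch.cuda.graph, which forwards it to capture_begin.

```python
try:
    import torch
    HAS_CUDA = torch.cuda.is_available()
except ImportError:  # lets the sketch load on machines without PyTorch
    HAS_CUDA = False


def share_pool_between_graphs():
    """Capture two graphs that draw from the same memory pool."""
    if not HAS_CUDA:
        return None  # CUDA graphs need a CUDA device

    g1 = torch.cuda.CUDAGraph()
    g2 = torch.cuda.CUDAGraph()
    static_in = torch.zeros(4, device="cuda")

    with torch.cuda.graph(g1):
        mid = static_in + 1

    # Hint that g2 may share g1's pool by passing g1's pool token.
    with torch.cuda.graph(g2, pool=g1.pool()):
        out = mid * 2

    static_in.fill_(1.0)
    g1.replay()   # replay in capture order: g1 first, then g2
    g2.replay()
    torch.cuda.synchronize()
    return out.tolist()
```

Graphs that share a pool are replayed here in the order they were captured and never concurrently, which is the conservative way to avoid one replay overwriting memory the other still needs.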

replay()

Replay the CUDA work captured by this graph.
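Because a replay reruns the same kernels on the same memory, new data has to be copied into the tensors that were live during capture. The sketch below illustrates that pattern over several inputs; the helper name replay_over_batches, the fixed length of 3, and the HAS_CUDA guard are assumptions made for the example.

```python
try:
    import torch
    HAS_CUDA = torch.cuda.is_available()
except ImportError:  # lets the sketch load on machines without PyTorch
    HAS_CUDA = False


def replay_over_batches(batches):
    """Square each length-3 batch by replaying one captured graph."""
    if not HAS_CUDA:
        return None  # CUDA graphs need a CUDA device

    graph = torch.cuda.CUDAGraph()
    static_in = torch.zeros(3, device="cuda")
    with torch.cuda.graph(graph):
        static_out = static_in * static_in

    results = []
    for batch in batches:
        # Copy into the captured input buffer; do not rebind the name.
        static_in.copy_(torch.tensor(batch, dtype=torch.float32))
        graph.replay()
        # tolist() synchronizes and copies the result before the next replay.
        results.append(static_out.tolist())
    return results
```

Each replay overwrites static_out, so results are copied out before the next iteration.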

reset()

Delete the graph currently held by this instance.