torch.nn.utils.clip_grad_norm_ — PyTorch 2.7 documentation
torch.nn.utils.clip_grad_norm_(parameters, max_norm, norm_type=2.0, error_if_nonfinite=False, foreach=None)[source]
Clip the gradient norm of an iterable of parameters.
The norm is computed over the norms of the individual gradients of all parameters, as if the norms of the individual gradients were concatenated into a single vector. Gradients are modified in-place.
This function is equivalent to torch.nn.utils.get_total_norm() followed by torch.nn.utils.clip_grads_with_norm_() with the total_norm returned by get_total_norm().
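A minimal sketch of that equivalence, assuming the two helpers accept the gradients and parameters as shown below (the model, loss, and max_norm value are illustrative, not part of the API):

>>> import torch
>>> from torch import nn
>>> model = nn.Linear(10, 1)
>>> model(torch.randn(4, 10)).sum().backward()
>>> params = [p for p in model.parameters() if p.grad is not None]
>>> # Two-step path: compute the total norm, then scale gradients in-place.
>>> total_norm = torch.nn.utils.get_total_norm([p.grad for p in params], norm_type=2.0)
>>> torch.nn.utils.clip_grads_with_norm_(params, max_norm=1.0, total_norm=total_norm)
>>> # Single call that does both and returns the same pre-clipping norm:
>>> # torch.nn.utils.clip_grad_norm_(params, max_norm=1.0)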
Parameters
- parameters (Iterable[Tensor] or Tensor) – an iterable of Tensors or a single Tensor that will have gradients normalized
- max_norm (float) – max norm of the gradients
- norm_type (float) – type of the used p-norm. Can be 'inf' for infinity norm.
- error_if_nonfinite (bool) – if True, an error is thrown if the total norm of the gradients from parameters is nan, inf, or -inf. Default: False (will switch to True in the future)
- foreach (bool) – use the faster foreach-based implementation. If None, use the foreach implementation for CUDA and CPU native tensors and silently fall back to the slow implementation for other device types. Default: None
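As an illustration of the non-default options above (the values are arbitrary, and model is assumed to be a module whose gradients have already been populated by a backward pass):

>>> torch.nn.utils.clip_grad_norm_(
...     model.parameters(), max_norm=1.0,
...     norm_type=float('inf'), error_if_nonfinite=True)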
Returns
Total norm of the parameter gradients (viewed as a single vector).
Return type
Tensor
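Example of typical use in a training loop (a minimal sketch; the model, data, and optimizer are illustrative). The returned value is the total gradient norm before clipping:

>>> import torch
>>> from torch import nn
>>> model = nn.Linear(10, 1)
>>> optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
>>> for _ in range(3):
...     optimizer.zero_grad()
...     loss = model(torch.randn(4, 10)).sum()
...     loss.backward()
...     # Clip between backward() and step(); returns the pre-clipping norm.
...     total_norm = torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
...     optimizer.step()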