GELU — PyTorch 2.7 documentation

class torch.nn.GELU(approximate='none')

Applies the Gaussian Error Linear Units function.

$$\text{GELU}(x) = x * \Phi(x)$$

where $\Phi(x)$ is the cumulative distribution function of the standard Gaussian distribution.
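For the standard Gaussian, the CDF can be written with the error function as $\Phi(x) = 0.5 * (1 + \text{erf}(x / \sqrt{2}))$. The following is a minimal sketch that evaluates the definition by hand and compares it with the module. The helper name `gelu_exact` is hypothetical and used only for illustration; it is not part of PyTorch.

```python
import math

import torch
import torch.nn as nn


def gelu_exact(x: torch.Tensor) -> torch.Tensor:
    """Hypothetical helper: GELU(x) = x * Phi(x), written out by hand."""
    # Phi(x) for the standard Gaussian, expressed through the error function.
    phi = 0.5 * (1.0 + torch.erf(x / math.sqrt(2.0)))
    return x * phi


x = torch.linspace(-4.0, 4.0, steps=9)
manual = gelu_exact(x)
builtin = nn.GELU()(x)  # approximate='none' is the default

# The two results should agree up to floating-point rounding.
print(torch.allclose(manual, builtin, atol=1e-5))
```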

When the approximate argument is 'tanh', GELU is estimated with:

$$\text{GELU}(x) = 0.5 * x * (1 + \text{Tanh}(\sqrt{2 / \pi} * (x + 0.044715 * x^3)))$$
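As a rough check, the tanh formula can be written out term by term and compared with `nn.GELU(approximate='tanh')`. This is a sketch, not the library's implementation; `gelu_tanh` is a hypothetical helper name.

```python
import math

import torch
import torch.nn as nn


def gelu_tanh(x: torch.Tensor) -> torch.Tensor:
    """Hypothetical helper: the tanh estimate of GELU, written out by hand."""
    # sqrt(2 / pi) * (x + 0.044715 * x^3)
    inner = math.sqrt(2.0 / math.pi) * (x + 0.044715 * x.pow(3))
    # 0.5 * x * (1 + tanh(inner))
    return 0.5 * x * (1.0 + torch.tanh(inner))


x = torch.linspace(-4.0, 4.0, steps=9)
manual_tanh = gelu_tanh(x)
builtin_tanh = nn.GELU(approximate="tanh")(x)

# The hand-written formula should match the module up to rounding.
print(torch.allclose(manual_tanh, builtin_tanh, atol=1e-5))
```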

Parameters

approximate (str, optional) – the GELU approximation algorithm to use: 'none' | 'tanh'. Default: 'none'
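The sketch below shows one way to select each mode and to measure how far apart the two results are on a sample grid. The grid range and size are arbitrary choices for illustration. The functional form `torch.nn.functional.gelu` takes the same `approximate` keyword.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Arbitrary illustrative grid of inputs.
x = torch.linspace(-5.0, 5.0, steps=1001)

exact = nn.GELU(approximate="none")(x)
approx = nn.GELU(approximate="tanh")(x)

# The functional form accepts the same keyword argument.
approx_fn = F.gelu(x, approximate="tanh")

# Largest absolute gap between the two modes on this grid.
max_gap = (exact - approx).abs().max().item()
print(f"max |exact - tanh| on [-5, 5]: {max_gap:.2e}")
```

On this grid the gap is small but nonzero, so the two modes should not be expected to give bit-identical outputs.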

Shape:

Input: (*), where * means any number of dimensions.

Output: (*), same shape as the input.

[Figure: plot of the GELU activation function]

Examples:

import torch
import torch.nn as nn

m = nn.GELU()
input = torch.randn(2)
output = m(input)
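Because the function is applied elementwise, the same module can be used on tensors with any number of dimensions. A small sketch, with an arbitrary illustrative input shape:

```python
import torch
import torch.nn as nn

m = nn.GELU()

# Arbitrary illustrative shape; any number of dimensions works.
batch = torch.randn(4, 3, 8)
out = m(batch)
print(out.shape)  # same shape as `batch`

# GELU(0) = 0 * Phi(0) = 0
zero_out = m(torch.zeros(1))
print(zero_out.item())
```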