torch.stft — PyTorch 2.7 documentation

torch.stft(input, n_fft, hop_length=None, win_length=None, window=None, center=True, pad_mode='reflect', normalized=False, onesided=None, return_complex=None, align_to_window=None)

Short-time Fourier transform (STFT).

Warning

From version 1.8.0, return_complex must always be given explicitly for real inputs, and return_complex=False has been deprecated. Strongly prefer return_complex=True, because in a future PyTorch release this function will only return complex tensors.

Note that torch.view_as_real() can be used to recover a real tensor with an extra last dimension for real and imaginary components.
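As a minimal sketch of the recommended call pattern, the snippet below requests a complex result and then converts it with torch.view_as_real(). The signal length, n_fft value, and variable names are illustrative assumptions, not values from this page.

```python
import torch

torch.manual_seed(0)
signal = torch.randn(16000)        # hypothetical 1-D real signal
window = torch.hann_window(400)    # tapered window matching n_fft

# Ask for a complex tensor explicitly, as the warning above recommends.
spec = torch.stft(signal, n_fft=400, window=window, return_complex=True)

# View the same data as a real tensor with a trailing dimension of size 2
# holding the real and imaginary components.
spec_real = torch.view_as_real(spec)

print(spec.dtype, tuple(spec.shape))
print(spec_real.dtype, tuple(spec_real.shape))
```

Index 0 of the last dimension of spec_real holds the real part and index 1 holds the imaginary part.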

Warning

From version 2.1, a warning is emitted if a window is not specified. In a future release, this argument will be required. Not providing a window currently defaults to using a rectangular window, which may result in undesirable artifacts. Consider using tapered windows, such as torch.hann_window().
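The following sketch illustrates the window behaviour described in this warning: it compares an STFT computed without a window against one computed with an explicit all-ones (rectangular) window, and against a Hann window. The signal and n_fft value are illustrative assumptions, and the missing-window warning is silenced only to keep the output clean.

```python
import warnings

import torch

torch.manual_seed(0)
signal = torch.randn(4096)   # hypothetical test signal
n_fft = 256

# Omitting the window triggers the warning described above; suppress it here.
with warnings.catch_warnings():
    warnings.simplefilter("ignore")
    implicit = torch.stft(signal, n_fft, return_complex=True)

# Explicit rectangular window: every sample in the frame has weight 1.
rectangular = torch.stft(
    signal, n_fft, window=torch.ones(n_fft), return_complex=True
)

# Tapered alternative suggested by the warning.
tapered = torch.stft(
    signal, n_fft, window=torch.hann_window(n_fft), return_complex=True
)

print(torch.allclose(implicit, rectangular))
print(torch.allclose(tapered, rectangular))
```

Under these assumptions the first comparison should report a match and the second should not, since the Hann window reweights the samples within each frame.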

The STFT computes the Fourier transform of short overlapping windows of the input. This gives the frequency components of the signal as they change over time. The interface of this function is modeled after (but is not a drop-in replacement for) the librosa stft function.

Ignoring the optional batch dimension, this method computes the following expression:

$$X[\omega, m] = \sum_{k = 0}^{\text{win\_length} - 1} \text{window}[k]\ \text{input}[m \times \text{hop\_length} + k]\ \exp\left(- j \frac{2 \pi \cdot \omega k}{\text{n\_fft}}\right),$$

where $m$ is the index of the sliding window, and $\omega$ is the frequency, with $0 \leq \omega < \text{n\_fft}$ for onesided=False, or $0 \leq \omega < \lfloor \text{n\_fft} / 2 \rfloor + 1$ for onesided=True.
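To make the expression concrete, the sketch below evaluates the sum directly and compares it with the output of torch.stft. It assumes center=False (so that frame m starts exactly at sample m × hop_length), win_length equal to n_fft, normalized=False, and onesided=False. The sizes and variable names are illustrative.

```python
import math

import torch

torch.manual_seed(0)
n_fft, hop_length = 8, 4
x = torch.randn(32, dtype=torch.float64)
window = torch.hann_window(n_fft, dtype=torch.float64)

# Reference result; center=False so no padding is applied to the input.
ref = torch.stft(
    x, n_fft, hop_length=hop_length, window=window,
    center=False, onesided=False, return_complex=True,
)

# frames[m, k] = input[m * hop_length + k]
frames = x.unfold(0, n_fft, hop_length)

# kernel[omega, k] = exp(-j * 2*pi * omega * k / n_fft)
omega = torch.arange(n_fft, dtype=torch.float64)
k = torch.arange(n_fft, dtype=torch.float64)
angle = -2.0 * math.pi * omega[:, None] * k[None, :] / n_fft
kernel = torch.complex(torch.cos(angle), torch.sin(angle))

# X[omega, m] = sum_k window[k] * input[m * hop_length + k] * kernel[omega, k]
windowed = (frames * window).to(kernel.dtype)   # shape (frames, n_fft)
manual = kernel @ windowed.t()                  # shape (n_fft, frames)

print(torch.allclose(manual, ref))
```

With the default center=True the input is padded before framing, so the frame start positions no longer line up with this unpadded indexing.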

Returns either a complex tensor of size $(* \times N \times T)$ if return_complex is true, or a real tensor of size $(* \times N \times T \times 2)$, where $*$ is the optional batch size of input, $N$ is the number of frequencies where STFT is applied, and $T$ is the total number of frames used.
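The output sizes described here can be checked with a short sketch. The batch size, signal length, n_fft, and hop_length below are illustrative assumptions. The frame count in the comment assumes the default center=True, where the input is padded by n_fft // 2 on each side.

```python
import torch

torch.manual_seed(0)
batch, length = 3, 4096
n_fft, hop_length = 512, 128
x = torch.randn(batch, length)            # hypothetical batch of signals
window = torch.hann_window(n_fft)

one_sided = torch.stft(
    x, n_fft, hop_length=hop_length, window=window,
    onesided=True, return_complex=True,
)
two_sided = torch.stft(
    x, n_fft, hop_length=hop_length, window=window,
    onesided=False, return_complex=True,
)

n_freqs = n_fft // 2 + 1                  # N for onesided=True
n_frames = 1 + length // hop_length       # T, assuming center=True

print(tuple(one_sided.shape), (batch, n_freqs, n_frames))
print(tuple(two_sided.shape), (batch, n_fft, n_frames))
print(tuple(torch.view_as_real(one_sided).shape))
```

The last line shows the real layout with the extra trailing dimension of size 2.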

Warning

This function changed its signature in version 0.4.1. Calling it with the previous signature may cause an error or return an incorrect result.

Parameters

Returns

A tensor containing the STFT result with shape (B?, N, T, C?) where

Return type

Tensor