audioTimeScaler - Apply time scaling to streaming audio - MATLAB
Apply time scaling to streaming audio
Description
The audioTimeScaler object performs audio time scale modification (TSM) independently across each input channel.
To modify the time scale of streaming audio:
- Create the audioTimeScaler object and set its properties.
- Call the object with arguments, as if it were a function.
To learn more about how System objects work, see What Are System Objects?
Creation
Syntax
Description
`aTS` = audioTimeScaler
creates an object, aTS, that performs audio time scale modification independently across each input channel over time.
`aTS` = audioTimeScaler(`speedupFactor`)
sets the SpeedupFactor property to speedupFactor.
`aTS` = audioTimeScaler(___,`'Name',Value`)
sets each property Name to the specified Value. Unspecified properties have default values.
Example: aTS = audioTimeScaler(1.2,'Window',sqrt(hann(1024,'periodic')),'OverlapLength',768) creates an object, aTS, that increases the tempo of audio by 1.2 times its original speed using a periodic 1024-point Hann window and a 768-point overlap.
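The creation syntax in the example above also fixes the largest frame the object accepts per call in the time domain, because the hop length is the window length minus the overlap length. A minimal sketch, assuming Audio Toolbox and Signal Processing Toolbox are installed; the variable names are illustrative only:

```matlab
% Sketch: create the object from the example and derive its hop length.
% Assumes audioTimeScaler (Audio Toolbox) and hann (Signal Processing Toolbox).
win = sqrt(hann(1024,'periodic'));
overlapLength = 768;
aTS = audioTimeScaler(1.2,'Window',win,'OverlapLength',overlapLength);

% Hop length = window length - overlap length = 1024 - 768 = 256.
hopLength = numel(aTS.Window) - aTS.OverlapLength;
fprintf('Largest time-domain frame per call: %d samples\n',hopLength);
```

With these settings, time-domain frames passed to aTS should have at most 256 rows.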
Properties
Unless otherwise indicated, properties are nontunable, which means you cannot change their values after calling the object. Objects lock when you call them, and the release function unlocks them.
If a property is tunable, you can change its value at any time.
For more information on changing property values, see System Design in MATLAB Using System Objects.
SpeedupFactor - Speedup factor, specified as a positive real scalar.
Tunable: Yes
Data Types: single | double | int8 | int16 | int32 | int64 | uint8 | uint16 | uint32 | uint64
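Because the speedup factor is tunable, it can be changed between calls without releasing the object. A minimal sketch, assuming Audio Toolbox and Signal Processing Toolbox; the noise frame is a hypothetical stand-in for real audio, and no claim is made here about the length of each output frame:

```matlab
% Sketch: change the tunable SpeedupFactor property while the object is locked.
win = sqrt(hann(1024,'periodic'));
overlapLength = 896;
aTS = audioTimeScaler(1.5,'Window',win,'OverlapLength',overlapLength);

% One hop (1024 - 896 = 128 samples) of placeholder noise.
frame = randn(numel(win) - overlapLength,1);

yFast = aTS(frame);        % first call locks the object
aTS.SpeedupFactor = 0.8;   % allowed while locked because the property is tunable
ySlow = aTS(frame);        % later frames use the new factor

release(aTS)
```

Nontunable properties such as Window or OverlapLength cannot be changed this way; they require a call to release first.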
InputDomain - Domain of the input signal, specified as "Time" or "Frequency".
Data Types: char | string
Window - Analysis window, specified as a real vector.
Note
If using audioTimeScaler with frequency-domain input, you must specify Window as the same window used to transform audioIn to the frequency domain.
Data Types: single | double
OverlapLength - Overlap length of adjacent analysis windows, specified as a nonnegative integer.
Note
If using audioTimeScaler with frequency-domain input, you must specify OverlapLength as the same overlap length used to transform audioIn to a time-frequency representation.
Data Types: single | double | int8 | int16 | int32 | int64 | uint8 | uint16 | uint32 | uint64
FFTLength - FFT length, specified as a positive integer. The default, [], means that the FFT length is equal to the number of rows in the input signal.
Dependencies
To enable this property, set InputDomain to 'Time'.
Data Types: single | double | int8 | int16 | int32 | int64 | uint8 | uint16 | uint32 | uint64
LockPhase - Apply identity phase locking, specified as true or false.
Data Types: logical
Usage
Syntax
Description
`audioOut` = aTS(`audioIn`)
applies time-scale modification to the input, audioIn, and returns the time-scaled output, audioOut.
Input Arguments
audioIn - Input audio, specified as a column vector or matrix. How audioTimeScaler interprets audioIn depends on the InputDomain property.
- If InputDomain is set to "Time", audioIn must be a real _N_-by-1 column vector or _N_-by-_C_ matrix. The number of rows, _N_, must be less than or equal to the hop length (size(`audioIn`,1) <= numel(`Window`)-`OverlapLength`). Columns of a matrix are interpreted as individual channels.
- If InputDomain is set to "Frequency", specify audioIn as a real or complex _NFFT_-by-1 column vector or _NFFT_-by-_C_ matrix. The number of rows, _NFFT_, is the number of points in the DFT calculation, and is set on the first call to the audio time scaler. _NFFT_ must be greater than or equal to the window length (size(`audioIn`,1) >= numel(`Window`)). Columns of a matrix are interpreted as individual channels.
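The two size constraints above can be checked before a frame reaches the object. A minimal sketch, assuming Signal Processing Toolbox for hann; the zero-filled stereo frames and the names hopLength, NFFT, and X are placeholders, not part of the audioTimeScaler interface:

```matlab
% Sketch: validate frame sizes against the documented constraints.
win = sqrt(hann(1024,'periodic'));
overlapLength = 896;
hopLength = numel(win) - overlapLength;   % 1024 - 896 = 128

% Time domain: a real frame with at most hopLength rows, one column per channel.
audioIn = zeros(hopLength,2);
assert(isreal(audioIn) && size(audioIn,1) <= hopLength, ...
    'Time-domain frames must be real with at most %d rows.',hopLength);

% Frequency domain: NFFT rows, where NFFT is at least the window length.
NFFT = 1024;
X = complex(zeros(NFFT,2));
assert(size(X,1) >= numel(win), ...
    'Frequency-domain frames must have at least %d rows.',numel(win));
```

Running such checks on the first frame is usually enough, since the frame size is expected to stay fixed while the object is locked.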
Data Types: single | double
Complex Number Support: Yes
Output Arguments
audioOut - Time-stretched audio, returned as a column vector or matrix.
Data Types: single | double
Object Functions
To use an object function, specify the System object™ as the first input argument. For example, to release system resources of a System object named obj, use this syntax:
release(obj)
| Function | Description |
|---|---|
| step | Run System object algorithm |
| release | Release resources and allow changes to System object property values and input characteristics |
| reset | Reset internal states of System object |
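The following sketch illustrates how reset and release might be used with an audioTimeScaler, assuming Audio Toolbox and Signal Processing Toolbox. The zero-valued frames are placeholders for real audio, and isLocked is the standard System object query for the locked state:

```matlab
% Sketch: reset internal states, then release to change a nontunable property.
win = sqrt(hann(1024,'periodic'));
aTS = audioTimeScaler(1.5,'Window',win,'OverlapLength',896);

frame = zeros(numel(win) - aTS.OverlapLength,1);   % 128-sample hop
y = aTS(frame);            % calling the object locks it

reset(aTS)                 % clear internal states, e.g. before an unrelated stream

release(aTS)               % unlock so nontunable properties can change
aTS.OverlapLength = 768;   % hop length becomes 1024 - 768 = 256

frame = zeros(numel(win) - aTS.OverlapLength,1);   % 256-sample hop
y = aTS(frame);            % locks again with the new configuration
release(aTS)
```

Calling release also permits a different input frame size on the next call, which is why the second frame can be longer than the first.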
Examples
To minimize artifacts caused by windowing, create a square root Hann window capable of perfect reconstruction. Use iscola to verify the design.
win = sqrt(hann(1024,'periodic'));
overlapLength = 896;
iscola(win,overlapLength)
Create an audioTimeScaler with a speedup factor of 1.5. Change the value of alpha to hear the effect of the speedup factor.
alpha = 1.5;
aTS = audioTimeScaler( ...
'SpeedupFactor',alpha, ...
'Window',win, ...
'OverlapLength',overlapLength);
Create a dsp.AudioFileReader object to read frames from an audio file. The length of frames input to the audio time scaler must be less than or equal to the analysis hop length defined in audioTimeScaler. To minimize buffering, set the samples per frame of the file reader to the analysis hop length.
hopLength = numel(aTS.Window) - overlapLength;
fileReader = dsp.AudioFileReader('Counting-16-44p1-mono-15secs.wav', ...
    'SamplesPerFrame',hopLength);
Create an audioDeviceWriter to write frames to your audio device. Use the same sample rate as the file reader.
deviceWriter = audioDeviceWriter('SampleRate',fileReader.SampleRate);
In an audio stream loop, read a frame from the file, apply time scale modification, and then write a frame to the device.
while ~isDone(fileReader)
    audioIn = fileReader();     % read one hop of audio
    audioOut = aTS(audioIn);    % apply time scale modification
    deviceWriter(audioOut);     % play the result
end
As a best practice, release your objects once done.
release(deviceWriter)
release(fileReader)
release(aTS)
Create a window capable of perfect reconstruction. Use iscola to verify the design.
win = kbdwin(512);
overlapLength = 256;
iscola(win,overlapLength)
Create an audioTimeScaler with a speedup factor of 0.8. Set InputDomain to "Frequency" and specify the window and overlap length used to transform time-domain audio to the frequency domain. Set LockPhase to true to increase the fidelity in the time-scaled output.
alpha = 0.8;
timeScaleModification = audioTimeScaler( ...
    "SpeedupFactor",alpha, ...
    "InputDomain","Frequency", ...
    "Window",win, ...
    "OverlapLength",overlapLength, ...
    "LockPhase",true);
Create a dsp.AudioFileReader object to read frames from an audio file. Create a dsp.STFT object to perform a short-time Fourier transform on streaming audio. Specify the same window and overlap length you used to create the audioTimeScaler. Create an audioDeviceWriter object to write frames to your audio device.
fileReader = dsp.AudioFileReader('RockDrums-44p1-stereo-11secs.mp3','SamplesPerFrame',numel(win)-overlapLength);
shortTimeFourierTransform = dsp.STFT('Window',win,'OverlapLength',overlapLength,'FFTLength',numel(win));
deviceWriter = audioDeviceWriter('SampleRate',fileReader.SampleRate);
In an audio stream loop:
- Read a frame from the file.
- Input the frame to the STFT. The dsp.STFT object performs buffering.
- Apply time scale modification.
- Write the modified audio to your audio device.
while ~isDone(fileReader)
    x = fileReader();                   % read one hop of audio
    X = shortTimeFourierTransform(x);   % transform to the frequency domain
    y = timeScaleModification(X);       % apply time scale modification
    deviceWriter(y);                    % play the result
end
As a best practice, release your objects once done.
release(fileReader)
release(shortTimeFourierTransform)
release(timeScaleModification)
release(deviceWriter)
Algorithms
audioTimeScaler uses the same phase vocoder algorithm as stretchAudio and is based on the descriptions in [1] and [2].
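For audio that is already fully loaded in memory, the offline stretchAudio function can be used in place of a streaming loop. A minimal sketch, assuming Audio Toolbox and the sample file used in the examples above; the source does not state whether the streaming and offline outputs match sample for sample, so treat this only as an illustration of the equivalent call:

```matlab
% Sketch: offline time scale modification of a whole signal with stretchAudio.
[audioIn,fs] = audioread('Counting-16-44p1-mono-15secs.wav');

win = sqrt(hann(1024,'periodic'));
alpha = 1.5;   % speedup factor, same meaning as SpeedupFactor

audioOut = stretchAudio(audioIn,alpha, ...
    'Window',win, ...
    'OverlapLength',896);

fprintf('Input: %.2f s, output: %.2f s\n', ...
    size(audioIn,1)/fs,size(audioOut,1)/fs);
```

The streaming object remains the better fit when audio arrives frame by frame or when the speedup factor has to change during playback.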
References
[1] Driedger, Jonathan, and Meinard Müller. "A Review of Time-Scale Modification of Music Signals." Applied Sciences. Vol. 6, Issue 2, 2016.
[2] Driedger, Jonathan. "Time-Scale Modification Algorithms for Music Audio Signals." Master's thesis, Saarland University, 2011.
Version History
Introduced in R2019b