Audio Processing - MATLAB & Simulink
Extend deep learning workflows with audio and speech processing applications
Apply deep learning to audio and speech processing applications by using Deep Learning Toolbox™ together with Audio Toolbox™. For signal processing applications, see Signal Processing. For applications in wireless communications, see Wireless Communications.
Apps
| App | Description |
| --- | --- |
| Signal Labeler | Label signal attributes, regions, and points of interest, and extract features |
Functions
- Data Management and Augmentation
- Feature Extraction
- Pretrained Networks
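These three function categories typically combine into one preprocessing pipeline. The following is a minimal MATLAB sketch of such a pipeline, not taken from this page: the dataset folder name is hypothetical, and the embedding call assumes the VGGish pretrained-model support package is installed.

```matlab
% Minimal preprocessing sketch (hypothetical folder name; requires Audio Toolbox
% and, for vggishEmbeddings, the VGGish pretrained-model support package).
ads = audioDatastore("soundData", ...                 % hypothetical dataset folder
    IncludeSubfolders=true,LabelSource="foldernames");
[audioIn,info] = read(ads);                           % one labeled recording
fs = info.SampleRate;

% Data management and augmentation: generate perturbed copies of the signal.
augmenter = audioDataAugmenter(NumAugmentations=2, ...
    TimeStretchProbability=0.5,PitchShiftProbability=0.5);
augmented = augment(augmenter,audioIn,fs);            % table with an Audio column

% Feature extraction: mel spectrogram and MFCC features for network input.
afe = audioFeatureExtractor(SampleRate=fs,melSpectrum=true,mfcc=true);
features = extract(afe,audioIn);                      % frames-by-features matrix

% Pretrained networks: reuse learned VGGish embeddings as features.
embeddings = vggishEmbeddings(audioIn,fs);
```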
Blocks
VGGish
YAMNet
| Block | Description |
| --- | --- |
| YAMNet | YAMNet sound classification network (Since R2021b) |
| Sound Classifier | Classify sounds in audio signal (Since R2021b) |
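YAMNet and Sound Classifier are Simulink blocks; at the command line the same models are reached through the classifySound function or the yamnet network. A rough MATLAB sketch, assuming the "Audio Toolbox Model for YAMNet" support package is installed and using a hypothetical file name:

```matlab
% Command-line counterpart of the Sound Classifier block (requires the
% "Audio Toolbox Model for YAMNet" support package; file name is hypothetical).
[audioIn,fs] = audioread("mySound.wav");

% One-call detection of sound classes in the recording.
sounds = classifySound(audioIn,fs);

% Lower-level workflow with the YAMNet network itself (YAMNet block equivalent).
net = yamnet;                               % pretrained YAMNet network
melSpect = yamnetPreprocess(audioIn,fs);    % mel spectrograms YAMNet expects
classes = classify(net,melSpect);           % one AudioSet label per spectrogram
```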
OpenL3
CREPE
| Block | Description |
| --- | --- |
| CREPE | CREPE deep pitch estimation neural network (Since R2023a) |
| Deep Pitch Estimator | Estimate pitch with CREPE deep learning neural network (Since R2023a) |
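Similarly, CREPE and Deep Pitch Estimator are Simulink blocks; the corresponding command-line workflow uses pitchnn or the crepe network. A rough MATLAB sketch, assuming the "Audio Toolbox Model for CREPE" support package is installed and using a hypothetical file name:

```matlab
% Command-line counterpart of the Deep Pitch Estimator block (requires the
% "Audio Toolbox Model for CREPE" support package; file name is hypothetical).
[audioIn,fs] = audioread("mySpeech.wav");

% One-call pitch (f0) contour estimation with the pretrained CREPE model.
f0 = pitchnn(audioIn,fs);

% Lower-level workflow with the CREPE network itself (CREPE block equivalent).
net = crepe;                                % pretrained CREPE network
frames = crepePreprocess(audioIn,fs);       % overlapping frames resampled to 16 kHz
acts = predict(net,frames);                 % pitch activations per frame
f0Direct = crepePostprocess(acts);          % convert activations to pitch in Hz
```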
Topics
- Deep Learning for Audio Applications (Audio Toolbox)
  Learn common tools and workflows to apply deep learning to audio applications.
- Classify Sound Using Deep Learning (Audio Toolbox)
  Train, validate, and test a simple long short-term memory (LSTM) network to classify sounds (a minimal training sketch follows this list).
- Adapt Pretrained Audio Network for New Data Using Deep Network Designer
  Interactively adapt a pretrained network to classify new audio signals using Deep Network Designer.
- Audio Transfer Learning Using Experiment Manager
  Configure an experiment that compares the performance of multiple pretrained networks applied to a speech command recognition task using transfer learning.
- Compare Speaker Separation Models
  Compare the performance, size, and speed of multiple deep learning speaker separation models.
- Speaker Identification Using Custom SincNet Layer and Deep Learning
  Perform speaker identification using a custom deep learning layer that implements a mel-scale filter bank.
- Dereverberate Speech Using Deep Learning Networks
  Train a deep learning model that removes reverberation from speech.
- Sequential Feature Selection for Audio Features
  Work through a typical feature selection workflow applied to the task of spoken digit recognition.
- Train Spoken Digit Recognition Network Using Out-of-Memory Audio Data
  Train a spoken digit recognition network on out-of-memory audio data using a transformed datastore.
- Train Spoken Digit Recognition Network Using Out-of-Memory Features
  Train a spoken digit recognition network on out-of-memory auditory spectrograms using a transformed datastore.
- Investigate Audio Classifications Using Deep Learning Interpretability Techniques
  Use interpretability techniques to investigate the predictions of a deep neural network trained to classify audio data.
- Accelerate Audio Deep Learning Using GPU-Based Feature Extraction
  Leverage GPUs for feature extraction to decrease the time required to train an audio deep learning model.
- AI for Speech Command Recognition (Audio Toolbox)
  Build, train, compress, and deploy a deep learning model for speech command recognition.
  - STEP 1: Train Deep Learning Network for Speech Command Recognition (Audio Toolbox)
  - STEP 2: Prune and Quantize Speech Command Recognition Network (Audio Toolbox)
  - STEP 3: Apply Speech Command Recognition Network in Simulink (Audio Toolbox)
  - STEP 4: Apply Speech Command Recognition Network in Smart Speaker Simulink Model (Audio Toolbox)
  - STEP 5: Deploy Smart Speaker Model on Raspberry Pi (Audio Toolbox)
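To make the Classify Sound Using Deep Learning workflow concrete, here is a minimal sketch of an LSTM sound classifier in MATLAB. It is illustrative only: the class names and the random "recordings" are placeholders, and the layer sizes and training options are not those of the linked example.

```matlab
% Illustrative LSTM sound-classification sketch (random placeholder data;
% not the code of the linked example).
fs = 16e3;
classNames = categorical(["siren","engine_idling"]);        % hypothetical classes

% Feature sequences: mel spectrograms transposed to feature-by-time.
afe = audioFeatureExtractor(SampleRate=fs,melSpectrum=true);
XTrain = {extract(afe,randn(fs,1))',extract(afe,randn(fs,1))'};
YTrain = classNames(:);                                      % one label per clip

layers = [ ...
    sequenceInputLayer(size(XTrain{1},1))                    % mel bands as channels
    bilstmLayer(64,OutputMode="last")                        % sequence-to-label
    fullyConnectedLayer(numel(categories(YTrain)))
    softmaxLayer
    classificationLayer];

options = trainingOptions("adam",MaxEpochs=5,MiniBatchSize=2,Verbose=false);
net = trainNetwork(XTrain,YTrain,layers,options);

% Predict the class of a new (here, random) clip.
label = classify(net,extract(afe,randn(fs,1))');
```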