Voiced Speech Enhancement Based on Adaptive Filtering of Selected Intrinsic Mode Functions

Empirical Mode Decomposition for Advanced Speech Signal Processing

Journal of Signal Processing, 2013

Empirical mode decomposition (EMD) is a relatively recent tool for analyzing nonlinear and non-stationary signals. It decomposes any signal into a finite number of time-varying subband signals termed intrinsic mode functions (IMFs). This data-adaptive decomposition has recently been applied to speech enhancement. This study presents the concept of EMD and its application to advanced speech signal processing paradigms, including speech enhancement by soft thresholding, voiced/unvoiced (V/Uv) speech discrimination, and pitch estimation. Speech processing is frequently performed in a transformed domain, and the transformation is usually achieved by traditional signal analysis techniques, i.e., Fourier and wavelet transforms. These methods employ a priori basis functions and are therefore not well suited to data-adaptive analysis of non-stationary signals such as speech. Recently, EMD has attracted considerable attention for data-adaptive speech signal processing. Several EMD-based soft-thresholding algorithms for speech enhancement are discussed here. V/Uv discrimination is an important concern in speech processing. It is usually performed using acoustic features, with training data used to determine the classification threshold. An EMD-based data-adaptive thresholding approach is developed for V/Uv discrimination without any training phase. A noticeable improvement is achieved when EMD is applied to pitch estimation of noisy speech signals. Related experimental results are also presented to demonstrate the effectiveness of EMD in advanced speech processing algorithms.
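As a rough illustration of how a signal is split into IMFs, the sketch below implements a simplified sifting loop in plain Python. Several choices here are assumptions rather than anything stated in the abstract: the envelopes are piecewise linear instead of the cubic splines normally used, each IMF gets a fixed number of sifting passes (`n_sift`) instead of a convergence criterion, and the names `emd`, `sift`, and `max_imfs` are illustrative only.

```python
def local_extrema(x):
    """Return indices of interior local maxima and minima of x."""
    maxima, minima = [], []
    for i in range(1, len(x) - 1):
        if x[i] > x[i - 1] and x[i] >= x[i + 1]:
            maxima.append(i)
        elif x[i] < x[i - 1] and x[i] <= x[i + 1]:
            minima.append(i)
    return maxima, minima


def envelope(x, idx):
    """Piecewise-linear envelope through x at idx, held flat at both ends.

    Assumption: linear interpolation replaces the usual cubic spline.
    """
    n = len(x)
    pts = [(0, x[idx[0]])] + [(i, x[i]) for i in idx] + [(n - 1, x[idx[-1]])]
    env, k = [], 0
    for t in range(n):
        while t > pts[k + 1][0]:
            k += 1
        (t0, v0), (t1, v1) = pts[k], pts[k + 1]
        env.append(v0 + (v1 - v0) * (t - t0) / (t1 - t0))
    return env


def sift(x, n_sift=10):
    """Extract one IMF candidate by repeatedly removing the envelope mean."""
    h = list(x)
    for _ in range(n_sift):
        maxima, minima = local_extrema(h)
        if len(maxima) < 2 or len(minima) < 2:
            break
        upper = envelope(h, maxima)
        lower = envelope(h, minima)
        h = [v - 0.5 * (u + l) for v, u, l in zip(h, upper, lower)]
    return h


def emd(x, max_imfs=6, n_sift=10):
    """Decompose x into a list of IMFs plus a residue."""
    residue = list(x)
    imfs = []
    for _ in range(max_imfs):
        maxima, minima = local_extrema(residue)
        if len(maxima) < 2 or len(minima) < 2:
            break  # too few oscillations left to extract another IMF
        imf = sift(residue, n_sift)
        imfs.append(imf)
        residue = [r - c for r, c in zip(residue, imf)]
    return imfs, residue
```

By construction, the IMFs and the residue sum back to the input signal, which is the property that lets an enhanced signal be rebuilt from processed IMFs.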

Speech enhancement using empirical mode decomposition and the Teager–Kaiser energy operator

The Journal of the Acoustical Society of America, 2014

This paper introduces a speech denoising strategy based on time-adaptive thresholding of the intrinsic mode functions (IMFs) of the signal, extracted by empirical mode decomposition (EMD). The denoised signal is reconstructed by superposing its adaptively thresholded IMFs. Adaptive thresholds are estimated using the Teager-Kaiser energy operator (TKEO) of the signal IMFs. More precisely, the TKEO identifies the type of each frame by amplifying the differences between speech and non-speech frames in each IMF. Because it is based on the EMD, the proposed speech denoising scheme is a fully data-driven approach. The method is tested on speech signals with different noise levels, and the results are compared to EMD-shrinkage and to the wavelet transform (WT) coupled with TKEO. Speech enhancement performance is evaluated using the output signal-to-noise ratio (SNR) and the perceptual evaluation of speech quality (PESQ) measure. For the speech signals analyzed, the proposed enhancement scheme performs better than the WT-TKEO and EMD-shrinkage approaches in terms of output SNR and PESQ. Time-adaptive thresholding reduces the noise considerably more than universal thresholding. The study is limited to signals corrupted by additive white Gaussian noise.
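For readers unfamiliar with the operator, the sketch below shows the standard discrete TKEO, psi[n] = x[n]^2 - x[n-1]*x[n+1], together with a per-frame average. The frame averaging and the helper name `frame_means` are assumptions added for illustration; the abstract does not say how the paper turns TKEO values into thresholds.

```python
import math


def tkeo(x):
    """Discrete Teager-Kaiser energy: psi[n] = x[n]**2 - x[n-1] * x[n+1].

    The two edge samples are filled by replicating their neighbours so the
    output has the same length as the input.
    """
    if len(x) < 3:
        raise ValueError("need at least 3 samples")
    psi = [x[n] ** 2 - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]
    return [psi[0]] + psi + [psi[-1]]


def frame_means(values, frame_len):
    """Average values over consecutive non-overlapping frames (hypothetical helper)."""
    means = []
    for start in range(0, len(values), frame_len):
        frame = values[start:start + frame_len]
        means.append(sum(frame) / len(frame))
    return means


# Demo: for a pure tone A*cos(w*n) the operator is constant, A^2 * sin(w)^2.
tone = [2.0 * math.cos(0.3 * n) for n in range(64)]
psi = tkeo(tone)
```

Frames whose mean TKEO is large relative to the others would be candidates for speech frames, although the decision rule itself is left open here.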

Intrinsic Mode Functions

2018

This paper introduces a new method for voiced speech enhancement that combines Empirical Mode Decomposition (EMD) with the Adaptive Center Weighted Average (ACWA) filter. The noisy signal is decomposed adaptively into intrinsic oscillatory components called Intrinsic Mode Functions (IMFs). Since the structure of voiced speech is concentrated mostly at medium and low frequencies, the shorter-scale IMFs of the noisy signal are dominated by noise, whereas the longer-scale ones are less noisy. The main idea of the proposed approach is therefore to filter only the shorter-scale IMFs and to keep the longer-scale ones unchanged. In fact, filtering the longer-scale IMFs would introduce distortion rather than reduce noise. The denoising method is applied to several voiced speech signals with different noise levels, and the results are compared with the wavelet approach, the ACWA filter, and EMD-ACWA (filtering of all IMFs using the ACWA filter). Relying on exhaustive simulations, we show the efficiency of the proposed method for reducing noise and its superiority over the other denoising methods, both in improving the Signal-to-Noise Ratio (SNR) and in offering better listening quality as measured by the Perceptual Evaluation of Speech Quality (PESQ). The present study is limited to signals corrupted by additive white Gaussian noise.
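The selective-filtering idea can be sketched as follows. Note the substitution: the abstract does not give the ACWA weights, so a generic local-statistics filter (local mean plus a variance-based gain) is used here as a stand-in, and it should not be read as the paper's filter. The names `local_stats_filter`, `enhance_selected`, and `n_filtered`, and the assumption that the noise variance is known, are all illustrative.

```python
def local_stats_filter(y, noise_var, half_win=2):
    """Stand-in for ACWA (assumption): local mean plus variance-based gain."""
    n = len(y)
    out = []
    for t in range(n):
        win = y[max(0, t - half_win):min(n, t + half_win + 1)]
        mean = sum(win) / len(win)
        var = sum((v - mean) ** 2 for v in win) / len(win)
        sig_var = max(var - noise_var, 0.0)      # estimated signal variance
        denom = sig_var + noise_var
        gain = sig_var / denom if denom > 0 else 1.0
        out.append(mean + gain * (y[t] - mean))
    return out


def enhance_selected(imfs, n_filtered, noise_var, half_win=2):
    """Filter only the first n_filtered (shorter-scale) IMFs, then sum all IMFs.

    imfs is assumed to be ordered from shortest to longest scale, as EMD
    produces them; the longer-scale IMFs are passed through unchanged.
    """
    cleaned = [
        local_stats_filter(c, noise_var, half_win) if j < n_filtered else list(c)
        for j, c in enumerate(imfs)
    ]
    return [sum(c[t] for c in cleaned) for t in range(len(imfs[0]))]
```

Setting `n_filtered` to the total number of IMFs gives the filter-everything variant that the abstract calls EMD-ACWA, which makes the two strategies easy to compare in a sketch like this.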

Speech Enhancement via EMD

EURASIP Journal on Advances in Signal Processing, 2008

In this study, two new approaches for speech signal noise reduction are proposed, both based on the empirical mode decomposition (EMD) introduced by Huang et al. (1998). Because they rely on the EMD, both noise reduction schemes are fully data-driven. The noisy signal is decomposed adaptively into oscillatory components called intrinsic mode functions (IMFs), using a temporal decomposition called the sifting process. Two strategies for noise reduction are proposed: filtering and thresholding. The basic principle of both methods is to reconstruct the signal from IMFs that have previously been either filtered, using the minimum mean-squared error (MMSE) filter introduced by I. Y. Soon et al. (1998), or thresholded, using a shrinkage function. The performance of these methods is analyzed and compared with that of the MMSE filter and wavelet shrinkage. The study is limited to signals corrupted by additive white Gaussian noise. The results show that the proposed denoising schemes perform better than the MMSE filter and the wavelet approach.
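The thresholding strategy can be illustrated with a soft shrinkage function applied to each IMF before summation. The abstract does not state how the threshold is chosen, so the sketch assumes the universal threshold sigma * sqrt(2 ln N) with a median-absolute-deviation noise estimate, a common choice in wavelet shrinkage. The function names are hypothetical.

```python
import math
import statistics


def soft_threshold(c, tau):
    """Shrink every sample toward zero by tau; samples below tau become zero."""
    return [math.copysign(max(abs(v) - tau, 0.0), v) for v in c]


def mad_sigma(c):
    """Robust noise standard deviation estimate: MAD / 0.6745 (assumption)."""
    med = statistics.median(c)
    return statistics.median(abs(v - med) for v in c) / 0.6745


def universal_threshold(sigma, n):
    """Universal threshold sigma * sqrt(2 ln n) (assumed, not from the abstract)."""
    return sigma * math.sqrt(2.0 * math.log(n))


def shrink_and_sum(imfs):
    """Soft-threshold each IMF with its own threshold and sum the results."""
    cleaned = []
    for c in imfs:
        tau = universal_threshold(mad_sigma(c), len(c))
        cleaned.append(soft_threshold(c, tau))
    return [sum(c[t] for c in cleaned) for t in range(len(imfs[0]))]
```

A per-IMF threshold is used because the noise energy carried by each IMF differs; a single global threshold would be a simpler but cruder alternative.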

An Improved Speech De-noising Method based on Empirical Mode Decomposition

Speech enhancement generally aims to improve the quality and intelligibility of a noise-contaminated speech signal by using various signal processing approaches. Removing noise from noisy speech is a common problem on which a vast amount of research has already been carried out. However, because different types of noise have different characteristics, the earlier approaches are not applicable to all of them. In addition, the earlier approaches did not focus on the nonlinear and non-stationary characteristics of noisy environments. EMD is a filtering approach that performs efficiently in non-stationary environments. This paper proposes a novel EMDF approach, inspired by thresholding, to remove the noise from a noisy speech sample. The proposed approach also includes a method, based on the IMF statistics, for selecting the IMF index that separates the residual low-frequency noise components from the speech estimate. An experimental study was carried out on speech samples contaminated with various types of noise, such as babble noise, restaurant noise, and car interior noise, at various strengths.

Nonlocal means estimation of intrinsic mode functions for speech enhancement

TURKISH JOURNAL OF ELECTRICAL ENGINEERING & COMPUTER SCIENCES, 2019

The main aim of this paper is to introduce a new approach to enhancing speech signals that exploits the advantages of nonlocal means (NLM) estimation and empirical mode decomposition. NLM, a patch-based denoising method, is extensively used for two-dimensional signals such as images. However, its use for one-dimensional signals has been attracting more attention recently. The NLM-based approach is quite useful for removing low-frequency noise based on nonlocal similarities among samples of the signal. However, it suffers from under-averaging in high-frequency regions. The temporal and spectral characteristics of the speech signal change markedly over time. Thus, unlike in image denoising, NLM is conventionally not effective at removing the noise components from a speech signal. To address this issue, the speech signal is first decomposed into oscillatory components called intrinsic mode functions (IMFs) by using a temporal decomposition technique known as the sifting process. Each IMF represents signal information at a certain scale or frequency band. The IMFs do not have abrupt power spectral changes over time. The decomposed IMFs are processed using NLM estimation based on nonlocal similarities for better speech enhancement. Simulation results show that the proposed method gives better performance in terms of subjective and objective quality measures. Its performance is evaluated for white, factory, and babble noise at different signal-to-noise ratios.
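A minimal one-dimensional NLM estimator, of the kind that could be applied to each IMF, might look like the sketch below. The patch size, search window, bandwidth `h`, the weight normalization, and the edge handling by index clamping are all assumptions chosen for illustration; the abstract does not give the paper's settings.

```python
import math


def nlm_1d(x, patch_half=2, search_half=10, h=0.5):
    """Nonlocal means for a 1-D signal.

    Each output sample is a weighted average of samples in a search window,
    weighted by how similar their surrounding patches are to the patch
    around the current sample.
    """
    n = len(x)

    def sample(i):
        # Clamp indices at the borders (assumed edge handling).
        return x[min(max(i, 0), n - 1)]

    patch_len = 2 * patch_half + 1
    out = []
    for i in range(n):
        num = den = 0.0
        for j in range(max(0, i - search_half), min(n, i + search_half + 1)):
            d2 = sum(
                (sample(i + k) - sample(j + k)) ** 2
                for k in range(-patch_half, patch_half + 1)
            )
            w = math.exp(-d2 / (patch_len * h * h))
            num += w * x[j]
            den += w
        out.append(num / den)  # den >= 1 because the j == i term has weight 1
    return out
```

A small `h` averages only over near-identical patches and so preserves repeating structure, while a large `h` approaches a plain moving average over the search window.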

Application of Variational Mode Decomposition on Speech Enhancement

Proceedings of the Second International Conference on Research in Intelligent and Computing in Engineering, 2017

Enhancing speech signals and reducing noise in speech remain challenging tasks for researchers. Among the many available methods, signal decomposition has attracted considerable attention in recent years. Empirical Mode Decomposition (EMD) has been applied to many decomposition problems. Recently, Variational Mode Decomposition (VMD) was introduced as an alternative that can easily separate signals of similar frequencies. This paper proposes VMD as the signal decomposition algorithm for denoising and enhancing speech signals. VMD decomposes the recorded speech signal into several modes. Speech contaminated with different types of noise is adaptively decomposed into components, termed Intrinsic Mode Functions (IMFs), as is done by the sifting process in the EMD method. The denoising technique is then applied using VMD. Each of the decomposed modes is compact. Simulation results show that the proposed method is well suited to speech enhancement and noise removal, restoring the original signal.

UNFOLDED HARDWARE ARCHITECTURE OF EMPIRICAL MODE DECOMPOSITION FOR SPEECH ENHANCEMENT

IAEME PUBLICATION, 2020

In this paper, we propose a novel dedicated unfolded hardware architecture of Empirical Mode Decomposition (EMD) for speech enhancement. In the EMD process, the speech signal is split into oscillatory components called Intrinsic Mode Functions (IMFs) by a temporal decomposition technique known as the sifting process. Each IMF represents signal information at a certain scale or frequency band. The denoised signal is then reconstructed from the IMFs that are signal dominated. We therefore instantiate the hardware core, comprising extrema extraction, envelope generation, mean calculation, and subtraction blocks, four times rather than using a single core and iterating through it repeatedly. This is done to make the system fast. Dedicated hardware ensures the stability, predictability, and availability of an algorithm. A comparison between the outputs of the software (MATLAB) simulation and the hardware architecture verifies the efficiency of the proposed architecture. The proposed hardware architecture is verified by implementing it on a ZedBoard (Zynq-7000 All Programmable SoC) using the Xilinx System Generator 2016.2.

Speech enhancement using adaptive thresholding based on gamma distribution of Teager energy operated intrinsic mode functions

TURKISH JOURNAL OF ELECTRICAL ENGINEERING & COMPUTER SCIENCES, 2019

This paper introduces a new speech enhancement algorithm based on adaptive thresholding of the intrinsic mode functions (IMFs) of noisy signal frames extracted by empirical mode decomposition. Adaptive threshold values are estimated by using the gamma statistical model of the Teager energy operated IMFs of the noisy speech and of the estimated noise, based on the symmetric Kullback–Leibler divergence. The enhanced speech signal is obtained with a semisoft thresholding function, which is used to threshold the IMF coefficients of the noisy speech. The method is tested on the NOIZEUS speech database and compared with wavelet-shrinkage and EMD-shrinkage methods in terms of segmental SNR improvement (SegSNR), weighted spectral slope (WSS), and perceptual evaluation of speech quality (PESQ). Experimental results show that the proposed method provides a higher SegSNR improvement in dB, a lower WSS distance, and higher PESQ scores than the wavelet-shrinkage and EMD-shrinkage methods. The proposed metho...
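One common form of semisoft thresholding uses two thresholds: samples below the lower one are zeroed, samples above the upper one are kept, and those in between are scaled linearly. The sketch below shows that form as an assumption; the abstract does not give the paper's exact function, and the gamma-model and Kullback–Leibler threshold estimation is not reproduced. The parameter names `t_low` and `t_high` are illustrative.

```python
import math


def semisoft_threshold(c, t_low, t_high):
    """Two-threshold (semisoft) shrinkage of the samples in c.

    |v| <= t_low           -> 0
    t_low < |v| <= t_high  -> sign(v) * t_high * (|v| - t_low) / (t_high - t_low)
    |v| > t_high           -> v (unchanged)
    """
    if not 0 <= t_low < t_high:
        raise ValueError("require 0 <= t_low < t_high")
    out = []
    for v in c:
        a = abs(v)
        if a <= t_low:
            out.append(0.0)
        elif a <= t_high:
            out.append(math.copysign(t_high * (a - t_low) / (t_high - t_low), v))
        else:
            out.append(v)
    return out
```

The middle branch reaches exactly `t_high` at `|v| = t_high`, so the mapping is continuous, which avoids the jump of hard thresholding while biasing large samples less than soft thresholding does.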

EMD BASED SPEECH ENHANCEMENT USING SOFT AND HARD THRESHOLD TECHNIQUES

In the last few decades, many attempts have been made to eliminate noise from speech signals. The purpose of any speech enhancement algorithm is to eliminate noise in a variety of environments, the most prominent of which are telecommunication applications. The purpose of this paper is to develop a novel speech enhancement algorithm that offers superior noise reduction over current methods. The paper demonstrates a novel time-domain speech enhancement algorithm based on empirical mode decomposition (EMD). EMD decomposes the noise-corrupted speech signal into a finite number of band-limited signals known as intrinsic mode functions (IMFs), using an iterative procedure called the sifting process. These IMFs are denoised using two different techniques: the first is IMF thresholding, the direct method of speech enhancement, and the second is an IMF frame-based method. Both methods use soft and hard thresholding to denoise the IMFs obtained from EMD. The algorithms are implemented in MATLAB and evaluated on real-time speech data. Experimental results show that the IMF frame-based method is superior to the direct method; this is tested by adding noise at different SNR values to clean speech.
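The difference between the direct and frame-based methods can be sketched as applying either one threshold to a whole IMF or a separate threshold to each frame of it. In the sketch below the per-frame thresholds are supplied by the caller, since the abstract does not say how they are derived; the function names and the `mode` argument are hypothetical.

```python
def hard_threshold(c, tau):
    """Keep samples whose magnitude exceeds tau, zero the rest."""
    return [v if abs(v) > tau else 0.0 for v in c]


def soft_threshold(c, tau):
    """Zero samples within tau of zero and shrink the rest toward zero by tau."""
    return [(v - tau) if v > tau else (v + tau) if v < -tau else 0.0 for v in c]


def threshold_by_frames(imf, frame_len, taus, mode="soft"):
    """Apply a separate threshold to each non-overlapping frame of an IMF.

    taus holds one threshold per frame (assumed to be given by the caller).
    With a single frame covering the whole IMF this reduces to the direct method.
    """
    rule = soft_threshold if mode == "soft" else hard_threshold
    starts = range(0, len(imf), frame_len)
    if len(taus) != len(starts):
        raise ValueError("need exactly one threshold per frame")
    out = []
    for tau, start in zip(taus, starts):
        out.extend(rule(imf[start:start + frame_len], tau))
    return out
```

A frame-wise threshold can be raised in noise-only frames and lowered in speech frames, which is one plausible reason a frame-based scheme could outperform a single threshold per IMF.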