Speech enhancement with masking properties in eigen-domain for colored noise

Signal Subspace Speech Enhancement for Audible Noise Reduction

Proceedings. (ICASSP '05). IEEE International Conference on Acoustics, Speech, and Signal Processing, 2005., 2000

A novel subspace-based speech enhancement scheme based on a criterion of audible noise reduction is considered. Masking properties of the human auditory system are used to define the audible noise quantity in the eigen-domain. Subsequently, an audible noise reduction scheme is developed based on a signal subspace technique. We derive the eigendecomposition of the estimated speech autocorrelation matrix under the assumption of white noise and outline the implementation of our proposed scheme. We further extend the scheme to the colored noise case. Simulation results show that our proposed scheme outperforms many existing subspace methods in terms of segmental signal-to-noise ratio (SNR), perceptual evaluation of speech quality (PESQ) and informal listening tests.
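
The white-noise case mentioned above can be illustrated with a minimal sketch. It is not the authors' audible-noise criterion: it assumes the noisy covariance is the clean covariance plus `noise_var` times the identity, and applies a generic Wiener-type gain per eigen-direction. The function name `subspace_enhance_frame` and the trade-off parameter `mu` are hypothetical.

```python
import numpy as np

def subspace_enhance_frame(y, R_y, noise_var, mu=1.0):
    """Enhance one frame y (length K) from the noisy covariance R_y (K x K).

    Assumption: white noise, so R_y = R_x + noise_var * I.
    mu is a hypothetical knob trading residual noise against distortion.
    """
    # Eigendecomposition of the symmetric noisy covariance matrix.
    eigvals, U = np.linalg.eigh(R_y)
    # Under white noise the clean eigenvalues are the noisy ones minus noise_var;
    # negative estimates are clipped to zero (noise-only subspace).
    clean_eigvals = np.maximum(eigvals - noise_var, 0.0)
    # Generic Wiener-type gain for each eigen-direction.
    gains = clean_eigvals / (clean_eigvals + mu * noise_var)
    # Project onto the eigenvectors, weight, and transform back.
    return U @ (gains * (U.T @ y))
```

Directions whose estimated clean eigenvalue is zero receive zero gain, which is how the noise-only subspace is discarded in this sketch.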

Improved embedded pre-whitening subspace approach for enhancing speech contaminated by colored noise

Speech Communication, 2018

An improved embedded pre-whitening subspace approach based on spectral domain constraints is proposed for enhancement of speech contaminated by colored noise. The main contribution of this work is a particular non-unitary spectral transformation of the residual noise. This non-unitary transform is based on simultaneous diagonalization of the clean speech and noise covariance matrices. With this transformation, the optimization problem that results from the subspace approach based on spectral domain constraints can be solved without any restrictions on the form of the contributed matrices. Under some theoretical assumptions the proposed method reduces to the Hu and Loizou method, which is similar to the proposed method except for the gain matrix. Compared with the Hu and Loizou method, speech enhancement measures for our approach show significant improvement in enhancing TIMIT sentences corrupted by brown, pink and multi-talker babble noises. Comparisons with two non-embedded pre-whitening subspace methods, PKLT and PKLT-VRE, show that the proposed method is comparable with these two methods.
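
Simultaneous diagonalization of two covariance matrices can be sketched as follows. This is a generic construction (Cholesky whitening followed by a symmetric eigendecomposition), not the specific transform or gain matrix of the paper. It assumes the noise covariance `R_n` is symmetric positive definite; the function name `joint_diagonalize` is hypothetical.

```python
import numpy as np

def joint_diagonalize(R_x, R_n):
    """Return (V, lam) such that V.T @ R_n @ V = I and V.T @ R_x @ V = diag(lam).

    Assumptions: R_x is symmetric, R_n is symmetric positive definite.
    """
    L = np.linalg.cholesky(R_n)            # R_n = L @ L.T
    L_inv = np.linalg.inv(L)
    # Speech covariance after whitening the noise.
    W = L_inv @ R_x @ L_inv.T
    W = 0.5 * (W + W.T)                    # remove round-off asymmetry
    lam, U = np.linalg.eigh(W)
    # V is non-unitary in general, because L_inv is not orthogonal.
    V = L_inv.T @ U
    return V, lam
```

Because `V` whitens the noise while diagonalizing the speech covariance, a per-component gain can be applied in the transformed domain without a separate pre-whitening step.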

An invertible frequency eigendomain transformation for masking-based subspace speech enhancement

IEEE Signal Processing Letters, 2005

Masking properties have been widely exploited in speech enhancement techniques, especially those implemented in the spectral domain. The incorporation of auditory masking in a subspace technique invariably requires a transformation linking the frequency and eigendomains. In this letter, an invertible transformation between the frequency and eigendomains is derived. The proposed transformation is verified through a conventional masking-based subspace speech-enhancement method. Simulation results show that our proposed transformation for speech enhancement outperforms the conventional transformation in terms of segmental signal-to-noise ratio (SNR), perceptual evaluation of speech quality (PESQ), and listening tests.

Perceptual Kalman filtering for speech enhancement in colored noise

2004 IEEE International Conference on Acoustics, Speech, and Signal Processing, 2004

This paper proposes a new method for speech enhancement in colored noise: a Kalman filter concatenated with a post-filter based on the masking properties of the human auditory system. A recursive approach to computing the noise covariance matrix is used to estimate the colored noise statistics. In the post-filter, both time-domain and frequency-domain masking properties are taken into account. The noisy speech spectrum is adjusted according to the calculated masking level. Simulation results show that the proposed approach has the best performance compared with other recent methods, evaluated with PESQ scores.

Speech Enhancement with Geometric Advent of Spectral Subtraction using Connected Time-Frequency Regions Noise Estimation

Research Journal of Applied Sciences, Engineering and Technology, 2013

Speech enhancement with Geometric Advent of Spectral subtraction using connected time-frequency regions noise estimation aims to de-noise or reduce background noise in noisy speech for better quality, pleasantness and improved intelligibility. Numerous enhancement methods have been proposed, including spectral subtraction, subspace and statistical methods, with different noise estimators. Traditional spectral subtraction techniques are reasonably simple to implement but suffer from musical noise. This study presents a new approach to speech enhancement that reduces the shortcomings of traditional spectral subtraction algorithms using MCRA. The approach and its noise estimation have been evaluated with PESQ (the ITU-T standard), frequency-weighted segmental SNR and weighted spectral slope. The analysis shows that the geometric approach with connected time-frequency regions gives better results than conventional spectral subtraction algorithms. Tests with normal-hearing listeners suggest that the new approach has lower audible musical noise.

SPEECH ENHANCEMENT USING SPECTRAL SUBTRACTION TECHNIQUE WITH MINIMIZED CROSS SPECTRAL COMPONENTS

The aim of speech enhancement is to obtain a significant reduction of noise and recover enhanced speech from noisy speech. There are several approaches to speech enhancement, but earlier approaches did not take cross spectral terms into account. Cross spectral terms become prominent when the processing window is small, i.e. 20 ms to 30 ms. In this paper, an enhancement method is proposed for significant reduction of noise and improvement in the quality and perceptibility of speech degraded by correlated additive background noise. The proposed method is based on the spectral subtraction technique. Simple spectral subtraction gives poor noise reduction. One of the main reasons is that it neglects the cross spectral terms of speech and noise, on the assumption that the clean speech and noise signals are completely uncorrelated, which is not true on a short-time basis. The proposed method achieves better noise reduction than the earlier methods, an improvement mainly attributed to accounting for the cross spectral terms between speech and noise. The algorithm can be implemented in hearing aids for the benefit of hearing-impaired people. Objective speech quality measures, spectrogram analyses and subjective listening tests confirm that the proposed method is more effective than earlier speech enhancement techniques.
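
The role of the cross spectral terms can be checked numerically. For a noisy frame y = x + d, the power spectrum satisfies |Y|^2 = |X|^2 + |D|^2 + 2 Re(X D*), and plain power subtraction drops the last term. The sketch below is only an illustration of that identity on a toy frame, not the paper's algorithm; the 8 kHz sampling rate, the sine stand-in for speech, and the function name `power_spectrum_terms` are assumptions.

```python
import numpy as np

def power_spectrum_terms(x, d):
    """Return (|Y|^2, |X|^2, |D|^2, cross term) for the frame y = x + d."""
    X = np.fft.rfft(x)
    D = np.fft.rfft(d)
    Y = np.fft.rfft(x + d)
    # Cross spectral term that plain power subtraction assumes to be zero.
    cross = 2.0 * np.real(X * np.conj(D))
    return np.abs(Y) ** 2, np.abs(X) ** 2, np.abs(D) ** 2, cross

# Toy 25 ms frame at an assumed 8 kHz sampling rate.
rng = np.random.default_rng(0)
fs = 8000
n = int(0.025 * fs)
t = np.arange(n) / fs
x = np.sin(2.0 * np.pi * 440.0 * t)      # stand-in for clean speech
d = 0.5 * rng.standard_normal(n)         # stand-in for additive noise

Py, Px, Pd, cross = power_spectrum_terms(x, d)
# What remains after subtracting both power spectra is exactly the cross term.
residual = Py - (Px + Pd)
```

On a single short frame `residual` is generally non-zero in every bin; it only averages towards zero over many frames when speech and noise are uncorrelated.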

Extension of the local subspace method to enhancement of speech with colored noise

2008

Based on dynamic features of human speech, the local projection (LP) method has been adapted to the enhancement of speech corrupted by white noise. As an extension of the LP method, a strategy with two rounds of projection is introduced to enhance speech contaminated with colored noise. Colored noise mainly resides in a low-dimensional subspace, and is assumed to be stationary in this communication. At step one, a noise-dominated subspace is first estimated from colored noise obtained from speech silence frames. Then, for the reference phase point, the components projected onto the noise-dominated subspace are deleted and the enhanced speech is reconstructed from the remaining components. The residual error of the output of step one tends to be distributed uniformly over all directions. So at step two, the LP method is further applied to the output of step one, treating the residual error as white noise. An adaptation of this strategy to continuous speech is performed. The results show that this strategy is more effective than the LP method in enhancing speech corrupted by colored noise, and is comparable to two typical speech enhancement methods.

EIGENDOMAIN-BASED NOISE ESTIMATION WITH THE MINIMUM STATISTICS APPROACH

2006

One of the challenges for single-channel speech enhancement is to estimate the noise statistics from a signal containing both speech and noise. In this paper, we present a technique for eigendomain-based noise estimation that uses minimum statistics to control the adaptation rate along each eigenvector. We demonstrate that this technique gives robust noise tracking for non-stationary noise.
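
As a rough illustration of the idea, the sketch below applies classical minimum tracking to the energy along each eigen-direction. It does not reproduce the paper's adaptation-rate control and omits bias compensation. The fixed orthonormal `basis`, the smoothing constant `alpha`, the window length `win`, and the function name `track_noise_min_stats` are all assumptions.

```python
import numpy as np

def track_noise_min_stats(frames, basis, alpha=0.9, win=50):
    """Estimate noise energy per eigen-direction by minimum tracking.

    frames: (T, K) array of signal frames.
    basis:  (K, K) matrix with orthonormal columns (assumed fixed eigenvectors).
    Returns a (T, K) array of noise energy estimates.
    """
    # Instantaneous energy of each frame along each basis vector.
    inst = (frames @ basis) ** 2
    # First-order recursive smoothing over time.
    smoothed = np.empty_like(inst)
    smoothed[0] = inst[0]
    for t in range(1, len(inst)):
        smoothed[t] = alpha * smoothed[t - 1] + (1.0 - alpha) * inst[t]
    # Noise estimate: minimum of the smoothed energy over the last `win` frames.
    noise = np.empty_like(smoothed)
    for t in range(len(smoothed)):
        noise[t] = smoothed[max(0, t - win + 1): t + 1].min(axis=0)
    return noise
```

Because the minimum is taken over a sliding window, short high-energy bursts (such as speech) barely raise the estimate, while slow changes in the noise floor are followed after at most `win` frames.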

Effect of Speech enhancement using spectral subtraction on various noisy environment

IRJET, 2022

Analysis-modification-synthesis (AMS) plays a key role in many audio signal processing applications, separating the audio stream into time intervals with speech activity and time intervals without speech. Many features that reflect the presence of speech have been introduced in the literature. This article therefore presents a structured overview of several established speech enhancement features targeting different characteristics of speech. The features are categorized in terms of the properties they exploit, and performance is evaluated in background noise, for different input SNR categories, and for some dedicated functions. Our analysis shows how to select promising voice activity detection (VAD) features and find reasonable tradeoffs between performance and complexity. To estimate clean speech using the Fast Fourier Transform (FFT), we emphasize the noise spectrum estimated during speech, subtract it from the noisy speech spectrum, and consider the average amplitude of the clean spectrum. We also tried to develop a new method to minimize the noise spectrum. The noise reduction algorithm is implemented in MATLAB: the noisy speech data are split into half-overlapping segments (overlap-add processing), and the FFT is used to calculate the corresponding amplitude spectrum so that noise can be removed from the noisy speech. The audio is then returned to the time domain, reconstructed with the Inverse Fast Fourier Transform (IFFT).
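
The FFT, subtract, IFFT, overlap-add chain described above can be sketched in a few lines. The article reports a MATLAB implementation; this Python version is only an illustrative stand-in. It assumes the noise magnitude spectrum `noise_mag` is supplied by the caller, uses a periodic Hann window with 50% overlap, and keeps the noisy phase. The function name `spectral_subtract` and the default frame length are hypothetical.

```python
import numpy as np

def spectral_subtract(noisy, noise_mag, frame_len=256):
    """Magnitude spectral subtraction with 50% overlap-add.

    noisy:     1-D array of noisy speech samples.
    noise_mag: noise magnitude spectrum estimate, length frame_len // 2 + 1
               (how it is estimated is left to the caller).
    """
    hop = frame_len // 2
    # Periodic Hann window: copies shifted by half a frame sum to one.
    window = np.hanning(frame_len + 1)[:-1]
    out = np.zeros(len(noisy))
    for start in range(0, len(noisy) - frame_len + 1, hop):
        frame = noisy[start:start + frame_len] * window
        spec = np.fft.rfft(frame)
        # Subtract the noise magnitude and clip negative results to zero.
        mag = np.maximum(np.abs(spec) - noise_mag, 0.0)
        # Recombine with the noisy phase and return to the time domain.
        clean = mag * np.exp(1j * np.angle(spec))
        out[start:start + frame_len] += np.fft.irfft(clean, n=frame_len)
    return out
```

With `noise_mag` set to zero the interior of the signal is reconstructed unchanged, which is a convenient sanity check on the windowing before any subtraction is applied.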