A General Optimization Procedure For Spectral Speech Enhancement Methods

A data-driven approach to optimizing spectral speech enhancement methods for various error criteria

Speech Communication, 2007

Gain functions for spectral noise suppression have been derived in the literature for some error criteria and statistical models. These gain functions are only optimal when the statistical model is correct and the speech and noise spectral variances are known. Unfortunately, the speech distributions are unknown and can at best be determined conditionally on the estimated spectral variance. We show that the "decision-directed" approach for speech spectral variance estimation can have an important bias at low SNRs, which generally leads to too much speech suppression. To correct for such estimation inaccuracies and adapt to the unknown speech statistics, we propose a general optimization procedure, with two gain functions applied in parallel. A conventional algorithm is run in the background and is used for a priori SNR estimation only. For the final reconstruction a different gain function is used, optimized for a wide range of signal-to-noise ratios. The gain function used for the reconstruction is trained on a speech database by minimizing a relevant error criterion. The procedure is illustrated for several error criteria. The method compares favorably to current state-of-the-art methods, and needs less smoothing in the decision-directed spectral variance estimator.
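As a rough illustration of the background step described in this abstract, the sketch below tracks the a priori SNR of a single frequency bin with the standard decision-directed recursion. It is not the paper's implementation: the function name, the known constant noise variance, the Wiener gain used as the background gain, and the values of `alpha` and `xi_min` are all assumptions made for the example.

```python
def decision_directed_snr(noisy_power, noise_var, alpha=0.98, xi_min=1e-3):
    """Track the a priori SNR of one frequency bin across frames.

    noisy_power: sequence of |Y(l)|^2 values, one per frame.
    noise_var:   noise spectral variance of this bin (assumed known and constant).
    alpha:       smoothing factor of the recursion (illustrative value).
    xi_min:      lower floor on the a priori SNR (illustrative value).
    Returns a list of (xi, gain) pairs, one per frame.
    """
    results = []
    prev_clean_power = 0.0  # |A_hat(l-1)|^2, clean-speech estimate of the previous frame
    for y2 in noisy_power:
        gamma = y2 / noise_var             # a posteriori SNR
        ml_term = max(gamma - 1.0, 0.0)    # instantaneous SNR estimate, clipped at zero
        xi = alpha * prev_clean_power / noise_var + (1.0 - alpha) * ml_term
        xi = max(xi, xi_min)
        gain = xi / (1.0 + xi)             # Wiener gain, assumed here as the background gain
        prev_clean_power = (gain ** 2) * y2
        results.append((xi, gain))
    return results


# Five frames of a bin whose noisy power is 101 times the noise variance.
for xi, gain in decision_directed_snr([101.0] * 5, 1.0):
    print(f"xi={xi:.3f}  gain={gain:.3f}")
```

In a two-gain scheme of the kind the abstract describes, only the `xi` values from this background pass would be kept; the reconstruction gain would then be looked up from a separately trained function of `xi`, which is not shown here.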

Generalized maximum a posteriori spectral amplitude estimation for speech enhancement

Speech Communication, 2015

Spectral restoration methods for speech enhancement aim to remove noise components from noisy speech signals by applying a gain function in the spectral domain. The design of the gain function is one of the most important factors in obtaining enhanced speech of good quality. In most studies, the gain function is designed by optimizing a criterion based on assumptions about the noise and speech distributions, such as the minimum mean square error (MMSE), maximum likelihood (ML), and maximum a posteriori (MAP) criteria. The MAP criterion has the advantage of yielding a more reliable gain function by incorporating a suitable prior density. However, as several studies have shown, it has a problem: although a MAP-based estimator effectively reduces noise components when the signal-to-noise ratio (SNR) is low, it introduces large speech distortion when the SNR is high. To solve this problem, we have proposed a generalized maximum a posteriori spectral amplitude (GMAPA) algorithm for designing the gain function. The GMAPA algorithm dynamically sets the weight of the prior density of the speech spectra according to the SNR of the test speech signals in order to calculate the optimal gain function. When the SNR is high, GMAPA adopts a small weight to prevent overcompensation that may result in speech distortion. When the SNR is low, GMAPA uses a large weight to avoid disturbance of the restoration caused by measurement noise. In our previous study, the weight of the prior density was shown to play a crucial role in GMAPA performance, and the weight was determined from the SNR at the utterance level. In this paper, we propose to compute the weight while taking time-frequency correlations into account, which results in a more accurate estimate of the gain function. Experiments were carried out to evaluate the proposed algorithm with both objective and subjective tests. The results of the objective tests indicate that GMAPA is promising compared to several well-known algorithms at both high and low SNRs. The results of the subjective listening tests indicate that GMAPA provides significantly higher sound quality than other speech enhancement algorithms.

Iterative speech enhancement with spectral constraints

ICASSP '87. IEEE International Conference on Acoustics, Speech, and Signal Processing

A new and improved iterative speech enhancement technique based on spectral constraints is presented in this paper. The iterative technique, originally formulated by Lim and Oppenheim, attempts to solve for the maximum likelihood estimate of a speech waveform in additive white noise. The new approach applies inter- and intra-frame spectral constraints to ensure convergence to reasonable values and hence improve speech quality. An extremely efficient technique for applying these constraints is the use of line spectral pair (LSP) coefficients. The inter-frame constraints ensure more speech-like formant trajectories than those found in the unconstrained approach. Results from speech degraded by additive white Gaussian noise show noticeable quality improvement.

Spectral Subtractive Type Speech Enhancement Methods

Advances in Electrical and Electronic Engineering, 2011

In this paper, the spectral subtraction method and some of its modifications are compared. The performance of spectral subtraction, its limitations, the artifacts it introduces, and the modifications proposed to eliminate these artifacts are discussed in detail. The algorithms are compared on the basis of the SNR improvement they achieve. Spectrograms of speech enhanced by the algorithms, which show the algorithms' performance and the degree of speech distortion, are also presented.

A Novel Spectral Conversion Based Approach for Noisy Speech Enhancement

Huy-Khoi DO, Trung-Nghia PHUNG, Huu-Cong NGUYEN, Van-Tao NGUYEN, and Quang-Vinh

Current noisy speech enhancement algorithms work well for additive noise but perform poorly for convolutive noise such as reverberation. Even for additive noise, when only one microphone source is available, noise estimation relies on the assumption of a slowly varying noise environment, commonly treated as stationary noise. However, real noise is non-stationary and is therefore difficult to estimate efficiently. Spectral conversion can be used to predict the vocal tract (spectral envelope) parameters of noisy speech without estimating the parameters of the noise source. It can therefore be applied to a general speech enhancement model, for both stationary and non-stationary additive noise environments as well as convolutive noise environments, when only one microphone source is available. In this paper, we propose a spectral conversion based speech enhancement method. The experimental results show that our method outperforms traditional methods.

Developments in Spectral Subtraction for Speech Enhancement

2012

Speech enhancement aims to improve speech quality by using various techniques and algorithms. Over the past several years, considerable attention has been focused on the enhancement of speech degraded by additive background noise. Background noise suppression has many applications. Using a mobile phone in a noisy environment, such as in the street or in a car, is an obvious one; another is removing the background noise when sending speech from the cockpit of an airplane to the ground or to the cabin. The spectral subtractive algorithm is historically one of the first algorithms proposed for additive background noise, and it has gone through many modifications over time. This is a review paper, and its objective is to provide an overview of the variety of spectral subtraction techniques that have been proposed during the past decades for the enhancement of speech degraded by additive background noise. Section I gives the Introduction to Speech enhancement and explains basic Spectral Subtraction t...

Modified magnitude spectral subtraction methods for speech enhancement

2017 International Conference on Electrical, Electronics, Communication, Computer, and Optimization Techniques (ICEECCOT), 2017

In the method proposed by Boll, noise is estimated from the first few frames of a silent region and subtracted from the spectral magnitude of the noisy speech; the subsequent steps are residual noise removal and additional signal attenuation to improve performance. In this work, we study the effect of non-stationary noise and the use of the over-subtraction factor and spectral floor factor proposed by Berouti. The experimental results indicate that the implemented variations give better performance for helicopter noise and for all noise types at negative SNRs. The spectrograms also indicate that this method performs similarly to Boll's proposal.
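The over-subtraction factor and spectral floor mentioned in this abstract can be sketched as follows. This is a minimal example in the power-spectral domain, assuming a per-bin noise power estimate is already available; the function name and the default values of `alpha` and `beta` are illustrative choices, not values taken from the paper.

```python
def oversubtract(noisy_power, noise_power, alpha=4.0, beta=0.01):
    """Spectral subtraction with an over-subtraction factor and a spectral floor.

    noisy_power: per-bin power of the noisy frame, |Y(k)|^2.
    noise_power: per-bin noise power estimate, |D(k)|^2.
    alpha:       over-subtraction factor (alpha >= 1 subtracts more than the estimate).
    beta:        spectral floor factor (0 < beta << 1), limits how far a bin can drop.
    Returns the per-bin estimate of the clean speech power.
    """
    enhanced = []
    for y2, d2 in zip(noisy_power, noise_power):
        diff = y2 - alpha * d2   # subtract a scaled noise estimate
        floor = beta * d2        # never go below a fraction of the noise power
        enhanced.append(diff if diff > floor else floor)
    return enhanced


# A strong speech bin keeps most of its power; a noise-dominated bin hits the floor.
print(oversubtract([10.0, 1.0], [1.0, 1.0]))  # [6.0, 0.01]
```

A larger `alpha` removes more of the isolated residual peaks at the cost of more speech distortion, while a non-zero `beta` fills the valleys between the remaining peaks with a low noise floor instead of zeros.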

SPEECH ENHANCEMENT USING SPECTRAL SUBTRACTION TECHNIQUE WITH MINIMIZED CROSS SPECTRAL COMPONENTS

The aim of speech enhancement is to obtain enhanced speech from noisy speech with a significant reduction of noise. There are several approaches to speech enhancement. Earlier approaches did not take cross-spectral terms into account. Cross-spectral terms become prominent when the processing window is short, i.e., 20-30 ms. In this paper, an enhancement method is proposed for significant reduction of noise and for improvement in the quality and perceptibility of speech degraded by correlated additive background noise. The proposed method is based on the spectral subtraction technique. The simple spectral subtraction technique results in poor noise reduction. One of the main reasons for this is that it neglects the cross-spectral terms of speech and noise, based on the assumption that the clean speech and noise signals are completely uncorrelated, which does not hold on a short-time basis. In this paper, an improvement in noise reduction is achieved compared to earlier methods. This improvement is mainly attributed to accounting for the cross-spectral terms between speech and noise. The algorithm can be implemented and used in hearing aids for the benefit of hearing-impaired people. Objective speech quality measures, spectrogram analyses, and subjective listening tests confirm that the proposed method is more effective than earlier speech enhancement techniques.
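The role of the cross-spectral term can be seen from the identity |Y|^2 = |S|^2 + |D|^2 + 2 Re(S D*) for a noisy bin Y = S + D. The sketch below only illustrates that identity for a single bin; it is not the paper's estimator, and the function name and test values are made up for the example.

```python
def power_with_cross_term(s, d):
    """Decompose the power of a noisy spectral bin Y = S + D.

    s, d: complex spectral values of clean speech and noise in one bin.
    Returns (noisy_power, additive_power, cross_term) where
    noisy_power = additive_power + cross_term.
    """
    y = s + d
    noisy_power = abs(y) ** 2
    additive_power = abs(s) ** 2 + abs(d) ** 2   # what plain spectral subtraction assumes
    cross_term = 2.0 * (s * d.conjugate()).real  # 2 Re(S D*), ignored by plain subtraction
    return noisy_power, additive_power, cross_term


# Orthogonal phases: the cross term vanishes and the additive model is exact.
print(power_with_cross_term(3 + 0j, 0 + 4j))   # (25.0, 25.0, 0.0)
# Aligned phases: the additive model underestimates the noisy power by half.
print(power_with_cross_term(1 + 0j, 1 + 0j))   # (4.0, 2.0, 2.0)
```

Over long windows the cross term averages towards zero for uncorrelated signals, but in a single short frame it can be as large as the speech and noise powers themselves.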

The Spectral Subtractive-Type Algorithms for Enhancement of Noisy Speech: A Review

International Journal of …, 2011

The spectral subtraction method is a classical approach for the enhancement of speech degraded by additive background noise. The basic principle of this method is to estimate the short-time spectral magnitude of speech by subtracting the estimated noise spectrum from the noisy speech spectrum. Equivalently, the noisy speech spectrum can be multiplied by a gain function and then combined with the phase of the noisy speech. Besides reducing the background noise, this method introduces an annoying, perceptible tonal artifact in the enhanced speech, known as remnant musical noise, which affects human listening. Several variations and implementations of this method have been adopted in the past decades to address the limitations of the spectral subtraction method. These variations constitute a family of subtractive-type algorithms and operate in the frequency domain. The objective of this paper is to provide an extensive overview of spectral subtractive-type algorithms for the enhancement of noisy speech. The paper concludes by indicating a future direction for speech enhancement research from the spectral subtraction perspective.
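The gain-function view of spectral subtraction described in this abstract can be sketched for a single frequency bin as follows. This is a minimal example assuming power-domain subtraction with half-wave rectification; the function name and the sample values are illustrative only.

```python
import cmath
import math


def spectral_subtract_bin(y, noise_power):
    """Apply power spectral subtraction to one complex bin, keeping the noisy phase.

    y:           complex spectral value of the noisy speech in this bin.
    noise_power: estimated noise power |D|^2 in this bin.
    Returns the complex estimate of the clean speech bin.
    """
    y_power = abs(y) ** 2
    if y_power == 0.0:
        return 0j
    # Gain form of subtraction: sqrt(|Y|^2 - |D|^2) = G * |Y|,
    # with negative differences clipped to zero.
    gain = math.sqrt(max(1.0 - noise_power / y_power, 0.0))
    # A real gain scales the magnitude and leaves the phase of y unchanged.
    return gain * y


s_hat = spectral_subtract_bin(3 + 4j, 9.0)
print(abs(s_hat), cmath.phase(s_hat) - cmath.phase(3 + 4j))
```

Because the gain is real and non-negative, only the magnitude is modified. Bins where the noise estimate exceeds the noisy power are set to zero, and the isolated peaks that survive this clipping are a commonly cited source of the musical noise described above.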

Review of Spectral Subtraction Techniques for Speech Enhancement

2011

Speech enhancement aims to improve speech quality by using various techniques and algorithms. The spectral subtraction technique is historically one of the first algorithms proposed for the removal of additive background noise. It is a single-channel technique for the enhancement of speech degraded by additive background noise. Background noise can affect conversation in a noisy environment, such as in the street or in a car, or when sending speech from the cockpit of an airplane to the ground or to the cabin, and it can affect both the quality and the intelligibility of speech. Over time, spectral subtraction has undergone many modifications. This is a review paper, and its objective is to provide an overview of the variety of spectral subtraction techniques that have been proposed during the past decades for the enhancement of speech degraded by additive background noise. Section I gives the Introduction to Speech enhancement and explains basic Spectral Subtraction technique. Se...