Subspace based Speech Enhancement using Common Vector Approach
Related papers
A new speech signal denoising algorithm using common vector approach
International Journal of Speech Technology, 2018
Speech denoising may improve the intelligibility of speech and hearing comfort in voice communication/recognition applications in noisy environments. It can also be used to enhance old recordings. Most speech enhancement methods are intrusive and cause some loss in the signal component while removing noise. In this paper, we propose a method based on the common vector approach (CVA) for reducing losses in single-channel enhancement algorithms. In the proposed technique, overlapping speech sample frames are grouped into classes according to their similarity, and the common and difference vectors of the classes are separated using CVA. Since the noise component is uncorrelated and therefore presumably concentrated in the difference part, the difference vectors are denoised using a common denoising technique and the sample frames are reconstructed by combining the common and the denoised difference parts. This operation does not affect the common vector, which helps secure an improvement even for very noisy data. Compared to the state of the art, highly promising results are obtained in terms of several speech quality measures.
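The common/difference split described in this abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes the textbook CVA construction for a class with fewer frames than samples per frame (Gram-Schmidt on the differences to the first frame), and the names dot, difference_basis and cva_split are hypothetical. Frame grouping and the denoising of the difference vectors are left out.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(u, v))


def difference_basis(frames, tol=1e-12):
    """Orthonormal basis of span{frames[i] - frames[0]} (Gram-Schmidt)."""
    ref = frames[0]
    basis = []
    for frame in frames[1:]:
        v = [x - r for x, r in zip(frame, ref)]
        # Remove the components already covered by earlier basis vectors.
        for z in basis:
            c = dot(v, z)
            v = [x - c * zk for x, zk in zip(v, z)]
        norm = dot(v, v) ** 0.5
        if norm > tol:  # skip linearly dependent differences
            basis.append([x / norm for x in v])
    return basis


def cva_split(frames):
    """Return (common vector, list of difference vectors) for one class."""
    basis = difference_basis(frames)
    ref = frames[0]
    # Project the reference frame onto the difference subspace.
    proj = [0.0] * len(ref)
    for z in basis:
        c = dot(ref, z)
        proj = [p + c * zk for p, zk in zip(proj, z)]
    # What remains is shared by every frame in the class.
    common = [r - p for r, p in zip(ref, proj)]
    diffs = [[x - c for x, c in zip(f, common)] for f in frames]
    return common, diffs


frames = [[1.0, 2.0, 3.0, 4.0], [2.0, 2.0, 3.0, 5.0], [1.0, 3.0, 3.0, 4.0]]
common, diffs = cva_split(frames)
# Each frame is recovered as common + (possibly denoised) difference vector.
rebuilt = [[c + d for c, d in zip(common, diff)] for diff in diffs]
```

In this sketch the common vector is orthogonal to every difference vector, so any processing applied to the difference part leaves the common part untouched, which matches the property the abstract relies on.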
The objective of this paper is threefold: (1) to provide an extensive review of signal subspace speech enhancement, (2) to derive an upper bound for the performance of these techniques, and (3) to present a comprehensive study of the potential of subspace filtering to increase the robustness of automatic speech recognisers against stationary additive noise distortions. Subspace filtering methods are based on the orthogonal decomposition of the noisy speech observation space into a signal subspace and a noise subspace. This decomposition is possible under the assumption of a low-rank model for speech, and on the availability of an estimate of the noise correlation matrix. We present an extensive overview of the available estimators, and derive a theoretical estimator to experimentally assess an upper bound to the performance that can be achieved by any subspace-based method. Automatic speech recognition (ASR) experiments with noisy data demonstrate that subspace-based speech enhancement can significantly increase the robustness of these systems in additive coloured noise environments. Optimal performance is obtained only if no explicit rank reduction of the noisy Hankel matrix is performed. Although this strategy might increase the level of the residual noise, it reduces the risk of removing essential signal information for the recogniser's back end. Finally, it is also shown that subspace filtering compares favourably to the well-known spectral subtraction technique.
Speech Denoising using Common Vector Analysis in Frequency Domain
International Journal of Applied Mathematics, Electronics and Computers, 2016
Signal denoising approaches for data of any dimension largely rely on the assumption that the data and the noise components, as well as the noise itself, are largely uncorrelated. However, any denoising process that depends heavily on this assumption falls short when the signal component is adversely affected by the operation. Therefore, several proposed algorithms try to separate the data into two or more parts with varying noise levels so that the denoising process can be applied to them with different parameters and constraints. In this paper, the proposed method separates the speech data into magnitude and phase, and the magnitude part is further separated into common and difference parts using common vector analysis. It is assumed that the noise largely resides in the difference part, which is therefore denoised by a known algorithm. The speech data is reconstructed by combining the common, difference and phase parts. Using the Linear Minimum Mean Square Error Estimation algorithm on the difference part, excellent denoising results are obtained. Results are compared with those of the state of the art using well-known speech quality measures.
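The magnitude/phase separation and the final recombination mentioned in this abstract can be illustrated with a plain discrete Fourier transform. This is a rough sketch under stated assumptions: a direct O(n^2) DFT is used for self-containment, no windowing or overlap-add is shown, and the function names are hypothetical. In the described method, the magnitude vectors would additionally pass through the common/difference split before reconstruction.

```python
import cmath


def dft(x):
    """Direct discrete Fourier transform of a real or complex sequence."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spectrum):
    """Inverse of dft()."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n)
                for k in range(n)) / n
            for t in range(n)]


def split_magnitude_phase(frame):
    """Return the magnitude and phase spectra of one frame."""
    spectrum = dft(frame)
    return [abs(c) for c in spectrum], [cmath.phase(c) for c in spectrum]


def rebuild_frame(magnitude, phase):
    """Recombine a (possibly modified) magnitude with the original phase."""
    spectrum = [cmath.rect(m, p) for m, p in zip(magnitude, phase)]
    return [c.real for c in idft(spectrum)]


frame = [0.0, 1.0, 0.5, -0.25, -1.0, 0.3, 0.2, -0.6]
magnitude, phase = split_magnitude_phase(frame)
# With an unmodified magnitude, the frame is recovered up to rounding error.
rebuilt = rebuild_frame(magnitude, phase)
```

Only the magnitude is altered in such a scheme; the noisy phase is reused as-is at reconstruction.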
Speech enhancement using pca and variance of the reconstruction error model identification
… Annual Conference of …, 2007
We present in this paper a signal subspace-based approach for enhancing a noisy signal. This algorithm is based on a principal component analysis (PCA) in which the optimal subspace selection is provided by a variance of the reconstruction error (VRE) criterion. This choice overcomes many limitations encountered with other selection criteria, like overestimation of the signal subspace or the need for empirical parameters. We have also extended our subspace algorithm to take into account the case of colored and babble noise. The performance evaluation, which is made on the Aurora database, measures improvements in the distributed speech recognition of noisy signals corrupted by different types of additive noises. Our algorithm succeeds in improving the recognition of noisy speech in all noisy conditions.
Perceptual subspace speech enhancement using variance of the reconstruction error
Digital Signal Processing, 2014
In this paper, a new signal subspace-based approach for enhancing a speech signal degraded by environmental noise is presented. The Perceptual Karhunen-Loève Transform (PKLT) method is improved here by including the Variance of the Reconstruction Error (VRE) criterion, in order to optimize the subspace decomposition model. The incorporation of the VRE in the PKLT (namely the PKLT-VRE hybrid method) yields a good tradeoff between the noise reduction and the speech distortion thanks to the combination of a perceptual criterion and the optimal determination of the noisy subspace dimension. In adverse conditions, the experimental tests, using objective quality measures, show that the proposed method provides a higher noise reduction and a lower signal distortion than the existing speech enhancement techniques.
Speech Enhancement using Signal Subspace Algorithm
In speech communication, the quality and intelligibility of speech are of utmost importance for ease and accuracy of information exchange. The speech processing systems used to communicate or store speech are usually designed for a noise-free environment, but in a real-world environment the presence of background interference in the form of additive background noise and channel noise drastically degrades the performance of these systems, causing inaccurate information exchange and listener fatigue. Speech enhancement algorithms attempt to improve the performance of communication systems when their input or output signals are corrupted by noise. Speech enhancement in general has three major objectives: (a) to improve perceptual aspects such as the quality and intelligibility of the processed speech, i.e. to make it sound better or clearer to the human listener; (b) to improve the robustness of speech coders, which tend to be severely affected by the presence of noise; and (c) to increase the accuracy of speech recognition systems operating in less than ideal locations.
Improved subspace based speech enhancement using an adaptive time segmentation
2005
Subspace based speech enhancement relies on the decomposition of the vector space spanned by the covariance matrix of noisy speech into a noise subspace and a signal subspace, where the noise subspace is nulled and the signal subspace is modified by applying a gain function. This gain function is determined by the eigenvalues of the noise and noisy speech covariance matrices, which are typically estimated from the noisy data using a fixed segmentation. A fixed segmentation often leads to covariance matrix estimates with an unnecessarily high variance or a bias, because segments are shorter or longer, respectively, than the region where the noisy data is stationary. To overcome this problem we present an adaptive time-segmentation algorithm combined with subspace based speech enhancement. As a result, smearing of speech sounds and musical noise in the enhanced speech signal are reduced. Experiments show improvements in terms of segmental SNR of 0.6 dB and symmetrical Itakura-Saito distortion measure over the use of a fixed segmentation.
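The nulling-plus-gain step described above can be sketched for the simplest case. The abstract does not state which gain function is used, so the following is an assumption: white noise with known variance and a Wiener-type gain with a tuning factor mu, in the style of classical signal subspace estimators. The function name subspace_gains and its parameters are hypothetical.

```python
def subspace_gains(noisy_eigenvalues, noise_variance, mu=1.0):
    """Per-eigenvector gains for a white-noise signal subspace filter.

    Assumption: each clean-speech eigenvalue is estimated as the noisy
    eigenvalue minus the noise variance. Directions whose estimate is not
    positive are treated as the noise subspace and nulled.
    """
    gains = []
    for lam_y in noisy_eigenvalues:
        lam_x = lam_y - noise_variance  # estimated clean-speech eigenvalue
        if lam_x <= 0.0:
            gains.append(0.0)  # noise subspace: nulled
        else:
            # Wiener-type gain; larger mu removes more residual noise
            # at the cost of more speech distortion.
            gains.append(lam_x / (lam_x + mu * noise_variance))
    return gains


gains = subspace_gains([5.0, 2.0, 1.0, 0.5], noise_variance=1.0)
```

An enhanced frame would then be obtained by projecting the noisy frame onto the eigenvectors, scaling each coefficient by its gain, and transforming back. The quality of the result depends on how well the eigenvalues are estimated, which is the point the adaptive segmentation addresses.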
Quality improvement of low-bit-rate noisy speech using the subspace method
2002
Additive noise presents a particularly difficult problem for linear predictive-based speech coding (LPC) systems operating at bit rates up to 4.8 kb/s. The performance of a cascaded signal-subspace-based speech enhancement algorithm/code-excited linear predictive (CELP) coding system is studied and compared with the conventional cascaded spectral-subtraction-based speech enhancement algorithm/CELP system. Both systems have been tested under the same conditions, i.e. the same type of noise (white noise) and the same signal-to-noise ratios. An LPC-based objective test is used for the evaluation of both systems. Based on this test, it is shown that the signal-subspace-based speech enhancement algorithm outperforms the spectral-subtraction-based speech enhancement algorithm, such that it could be recommended as a preprocessor for any LPC speech coding system.
Signal Subspace Speech Enhancement for Audible Noise Reduction
Proceedings. (ICASSP '05). IEEE International Conference on Acoustics, Speech, and Signal Processing, 2005., 2000
A novel subspace-based speech enhancement scheme based on a criterion of audible noise reduction is considered. Masking properties of the human auditory system are used to define the audible noise quantity in the eigen-domain. Subsequently, an audible noise reduction scheme is developed based on a signal subspace technique. We derive the eigendecomposition of the estimated speech autocorrelation matrix with the assumption of white noise and outline the implementation of our proposed scheme. We further extend the scheme to the colored noise case. Simulation results show that our proposed scheme outperforms many existing subspace methods in terms of segmental signal-to-noise ratio (SNR), perceptual evaluation of speech quality (PESQ) and informal listening tests.
Single channel speech enhancement using principal component analysis and MDL subspace selection
1999
We present in this paper a novel subspace approach for single channel speech enhancement and speech recognition in highly noisy environments. Our algorithm is based on principal component analysis and the optimal subspace selection is provided by a minimum description length criterion. This choice overcomes the limitations encountered with other selection criteria, like the overestimation of the signal plus noise subspace or the need for empirical parameters. We have also extended our subspace algorithm to take into account the case of colored noise. The performance evaluation shows that our method provides a higher noise reduction and a lower signal distortion than existing enhancement methods and that speech recognition in noise is improved. Our algorithm succeeds in extracting the relevant features of speech even in highly noisy conditions without introducing artefacts such as "musical noise".
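Subspace selection by minimum description length can be illustrated with the classical eigenvalue-based form of the criterion. This is a sketch under an assumption: the abstract does not give the exact MDL expression, so the widely used form that compares the geometric and arithmetic means of the trailing eigenvalues is shown here, and the paper's variant may differ. The function name mdl_rank is hypothetical.

```python
import math


def mdl_rank(eigenvalues, num_samples):
    """Pick the signal subspace dimension that minimizes an MDL score.

    eigenvalues: positive covariance eigenvalues (any order).
    num_samples: number of observation vectors used for the estimate.
    """
    lam = sorted(eigenvalues, reverse=True)
    p = len(lam)
    best_k, best_score = 0, float("inf")
    for k in range(p):
        tail = lam[k:]  # candidate noise-only eigenvalues
        arith = sum(tail) / len(tail)
        log_geo = sum(math.log(v) for v in tail) / len(tail)
        # Fit term: zero when the trailing eigenvalues are all equal.
        fit = -num_samples * len(tail) * (log_geo - math.log(arith))
        # Penalty term grows with the number of free parameters.
        penalty = 0.5 * k * (2 * p - k) * math.log(num_samples)
        score = fit + penalty
        if score < best_score:
            best_k, best_score = k, score
    return best_k


rank = mdl_rank([10.0, 8.0, 1.0, 1.0, 1.0, 1.0], num_samples=100)
```

No empirical threshold is needed in this form: the penalty term alone decides where the flat noise floor begins, which is the advantage the abstract attributes to the criterion.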