Toward blind reverberation time estimation for non-speech signals

The Journal of the Acoustical Society of America, 2013

Reverberation time (RT) is an important parameter for room acoustics characterization, intelligibility and quality assessment of reverberant speech, and for dereverberation. Commonly, RT is estimated from the room impulse response (RIR). In practice, however, RIRs are often unavailable or continuously changing. As such, blind estimation of RT based only on the recorded reverberant signals is of great interest. To date, blind RT estimation has focused on reverberant speech signals. Here, we propose to blindly estimate RT from non-speech signals, such as solo instrument recordings and music ensembles. To estimate the RT of non-speech signals, we propose a blind estimator based on an auditory-inspired modulation spectrum signal representation, which measures the modulation frequency of temporal envelopes computed from a 23-channel gammatone filterbank. We show that the higher modulation frequency bands are more sensitive to reverberation than the modulation bands below 20 Hz. When tested on a database of non-speech sounds under 23 different reverberation conditions with reverberation time (T40) ranging from 0.18 to 15.62 s, a blind estimator based on the ratio of high-to-low modulation frequencies outperformed two state-of-the-art methods and achieved correlations with EDT as high as 0.92 for solo instruments and 0.87 for ensembles.

Performance comparison of algorithms for blind reverberation time estimation from speech

2012

The reverberation time, T60, is one of the key parameters used to quantify room acoustics. It can provide information about the quality and intelligibility of speech recorded in a reverberant environment, and it can be used to increase robustness to reverberation of speech processing algorithms. T60 can be determined directly from a measurement of the acoustic impulse response, but in situations where this is unavailable it must be estimated blindly from reverberant speech. In this contribution, we provide a study of three state-of-the-art methods for blind T60 estimation. Experimental results with a large number of talkers, simulated and measured acoustic impulse responses, and various levels of additive white Gaussian noise are presented. The relative merits of the three methods in terms of computational time, estimation accuracy, noise sensitivity and intertalker variance are discussed. In general, all three methods are able to estimate the reverberation time to within 0.2 s for T60 ≤ 0.8 s and SNR ≥ 30 dB, while increasing the noise level causes overestimation. The relative computational speed of the three methods is also assessed.
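For context, the non-blind reference mentioned in this abstract (T60 obtained directly from a measured impulse response) is commonly computed with Schroeder backward integration. The following is a minimal sketch of that idea, not the procedure used in the study: the function name `schroeder_t60` and the -5 dB to -25 dB fitting range are assumptions chosen for illustration.

```python
import math

def schroeder_t60(rir, fs, fit_start_db=-5.0, fit_end_db=-25.0):
    """Estimate T60 (seconds) from an impulse response via backward integration.

    Hypothetical helper: the fit range (-5 dB to -25 dB) is an assumed choice.
    """
    # Energy decay curve: energy remaining from each sample to the end.
    edc = []
    total = 0.0
    for v in reversed(rir):
        total += v * v
        edc.append(total)
    edc.reverse()
    if not edc or edc[0] <= 0.0:
        raise ValueError("impulse response has no energy")

    # Collect (time, level in dB) points inside the fitting range.
    times, levels = [], []
    for n, e in enumerate(edc):
        if e <= 0.0:
            break
        level = 10.0 * math.log10(e / edc[0])
        if fit_end_db <= level <= fit_start_db:
            times.append(n / fs)
            levels.append(level)
    if len(times) < 2:
        raise ValueError("not enough points in the fitting range")

    # Least-squares slope of level versus time, in dB per second.
    mean_t = sum(times) / len(times)
    mean_l = sum(levels) / len(levels)
    num = sum((t - mean_t) * (l - mean_l) for t, l in zip(times, levels))
    den = sum((t - mean_t) ** 2 for t in times)
    slope = num / den

    # Extrapolate the fitted slope to a 60 dB decay.
    return -60.0 / slope
```

Blind estimators such as those compared above aim to approximate this value when no impulse response is available.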

An improved algorithm for blind reverberation time estimation

… of International Workshop …, 2010

An improved algorithm for the estimation of the reverberation time (RT) from reverberant speech signals is presented. This blind estimation of the RT is based on a simple statistical model for the sound decay such that the RT can be estimated by means of a maximum-likelihood (ML) estimator. The proposed algorithm has a significantly lower computational complexity than previous ML-based algorithms for RT estimation. This is achieved by a downsampling operation and a simple pre-selection of possible sound decays. The new algorithm is more suitable to track time-varying RTs than related approaches. In addition, it can also estimate the RT in the presence of (moderate) background noise.

Blind estimation of reverberation time

The Journal of the Acoustical Society of America, 2003

The reverberation time (RT) is an important parameter for characterizing the quality of an auditory space. Sounds in reverberant environments are subject to coloration. This affects speech intelligibility and sound localization. Many state-of-the-art audio signal processing algorithms, for example in hearing-aids and telephony, are expected to have the ability to characterize the listening environment, and turn on an appropriate processing strategy accordingly. Thus, a method for characterization of room RT based on passively received microphone signals represents an important enabling technology. Current RT estimators, such as Schroeder's method, depend on a controlled sound source, and thus cannot produce an online, blind RT estimate. Here, a method for estimating RT without prior knowledge of sound sources or room geometry is presented. The diffusive tail of reverberation was modeled as an exponentially damped Gaussian white noise process. The time-constant of the decay, which provided a measure of the RT, was estimated using a maximum-likelihood procedure. The estimates were obtained continuously, and an order-statistics filter was used to extract the most likely RT from the accumulated estimates. The procedure was illustrated for connected speech. Results obtained for simulated and real room data are in good agreement with the real RT values.
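The decay model described in this abstract can be illustrated on a single free-decay segment. The sketch below assumes the segment follows y[n] = a^n x[n] with x[n] zero-mean Gaussian white noise, and finds the decay by a plain grid search over candidate T60 values. The grid search, the function name `ml_decay_estimate`, and the single-segment setup are illustrative assumptions; the continuous estimation and the order-statistics filter mentioned in the abstract are not reproduced here.

```python
import math
import random

def ml_decay_estimate(y, fs, t60_grid):
    """Grid-search ML estimate of T60 for y[n] = a**n * x[n], x ~ N(0, sigma^2).

    Hypothetical helper: returns the candidate in t60_grid with the highest
    likelihood. Very small candidates on long segments can overflow exp().
    """
    n_samples = len(y)
    sum_n = n_samples * (n_samples - 1) / 2.0  # 0 + 1 + ... + (N - 1)
    best_t60, best_ll = None, -math.inf
    for t60 in t60_grid:
        # Natural log of the per-sample amplitude factor that yields
        # a 60 dB drop after t60 seconds.
        log_a = -3.0 * math.log(10.0) / (t60 * fs)
        # ML variance of the undamped noise for this candidate decay.
        sigma2 = sum(v * v * math.exp(-2.0 * log_a * n)
                     for n, v in enumerate(y)) / n_samples
        # Log-likelihood with sigma2 substituted in (constant terms dropped).
        ll = -0.5 * n_samples * math.log(sigma2) - sum_n * log_a
        if ll > best_ll:
            best_t60, best_ll = t60, ll
    return best_t60

# Usage on a synthetic decay with a known T60 of 0.5 s.
random.seed(0)
fs = 8000
true_t60 = 0.5
segment = [10.0 ** (-3.0 * n / (true_t60 * fs)) * random.gauss(0.0, 1.0)
           for n in range(fs // 2)]
grid = [0.1 + 0.01 * k for k in range(141)]  # candidates from 0.1 s to 1.5 s
estimate = ml_decay_estimate(segment, fs, grid)
```

With amplitude decaying as exp(-t/τ), the time constant τ and the reverberation time are related by T60 = 3 ln(10) τ ≈ 6.91 τ, which is the conversion used inside the loop.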

Blind reverberation time estimation by intrinsic modeling of reverberant speech

2013 IEEE International Conference on Acoustics, Speech and Signal Processing, 2013

The reverberation time (RT) is a very important measure that quantifies the acoustic properties of a room and provides information about the quality and intelligibility of speech recorded in that room. Moreover, information about the RT can be used to improve the performance of automatic speech recognition systems and speech dereverberation algorithms. In a recent study, it has been shown that existing methods for blind estimation of the RT are highly sensitive to additive noise. In this paper, a novel method is proposed to blindly estimate the RT based on the decay rate distribution. Firstly, a data-driven representation of the underlying decay rates of several training rooms is obtained via the eigenvalue decomposition of a specially-tailored kernel. Secondly, the representation is extended to a room under test and used to estimate its decay rate (and hence its RT). The presented results show that the proposed method outperforms a competing method and is significantly more robust to noise.

A blind algorithm for reverberation-time estimation using subband decomposition of speech signals

The Journal of the Acoustical Society of America, 2012

An algorithm for blind estimation of reverberation time (RT) in speech signals is proposed. Analysis is restricted to the free-decaying regions of the signal, where the reverberation effect dominates, yielding a more accurate RT estimate at a reduced computational cost. A spectral decomposition is performed on the reverberant signal and partial RT estimates are determined in all signal subbands, providing more data to the statistical-analysis stage of the algorithm, which yields the final RT estimate. Algorithm performance is assessed using two distinct speech databases, achieving 91% and 97% correlation with the RTs measured by a standard nonblind method, indicating that the proposed method blindly estimates the RT in a reliable and consistent manner.

Blind estimation of reverberation time based on the distribution of signal decay rates

2008

The reverberation time is one of the most prominent acoustic characteristics of an enclosure. Its value can be used to predict speech intelligibility, and is used by speech enhancement techniques to suppress reverberation. The reverberation time is usually obtained by analysing the decay rate of i) the energy decay curve that is observed when a noise source is switched off, and ii) the energy decay curve of the room impulse response. Estimating the reverberation time using only the observed reverberant speech signal, i.e., blind estimation, is required for speech evaluation and enhancement techniques. Recently, (semi) blind methods have been developed. Unfortunately, these methods are not very accurate when the source consists of a human speaker, and unnatural speech pauses are required to detect and/or track the decay. In this paper we extract and analyse the decay rate of the energy envelope blindly from the observed reverberant speech signal in the short-time Fourier transform domain. We develop a method to estimate the reverberation time using a property of the distribution of the decay rates. Experimental results using simulated and real reverberant speech signals demonstrate the performance of the new method.

Speech-Model Based Accurate Blind Reverberation Time Estimation Using an LPC Filter

IEEE Transactions on Audio, Speech, and Language Processing, 2012

In this paper, we propose a speech-model based method using the linear predictive (LP) residual of the speech signal and the maximum-likelihood (ML) estimator proposed in "Blind estimation of reverberation time," (R. Ratnam et al., J. Acoust. Soc. Amer., 2004) to blindly estimate the reverberation time (RT60). The input speech is passed through a low order linear predictive coding (LPC) filter to obtain the LP residual signal. It is proven that the unbiased autocorrelation function of this LP residual has the required properties to be used as an input to the ML estimator. It is shown that this method can successfully estimate the reverberation time with less data than existing blind methods. Experiments show that the proposed method can produce better estimates of RT60, even in highly reverberant rooms. This is because the entire input speech data is used in the estimation process. The proposed method is not sensitive to the type of input data (voiced, unvoiced), number of gaps, or window length. In addition, evaluation using white Gaussian noise and recorded babble noise shows that it can estimate RT60 in the presence of (moderate) background noise.
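The first step described above, passing the input through a low-order LPC filter to obtain the LP residual, might be sketched as follows. This is only an illustration of a generic LP residual computation using the autocorrelation method with the Levinson-Durbin recursion; the function names `lpc_coefficients` and `lp_residual`, and the choice of this particular LPC method, are assumptions rather than details taken from the paper. The later stages (the unbiased autocorrelation of the residual and the ML estimator) are not shown.

```python
import random

def lpc_coefficients(x, order):
    """Return [1, a1, ..., ap] of the prediction-error filter A(z).

    Hypothetical helper using the autocorrelation method and the
    Levinson-Durbin recursion.
    """
    n = len(x)
    # Biased autocorrelation estimates r[0..order].
    r = [sum(x[i] * x[i + k] for i in range(n - k)) for k in range(order + 1)]
    if r[0] <= 0.0:
        raise ValueError("signal has no energy")
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err  # reflection coefficient for stage i
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        err *= (1.0 - k * k)  # prediction error power after stage i
    return a

def lp_residual(x, a):
    """Filter x with A(z): e[n] = x[n] + a1*x[n-1] + ... + ap*x[n-p]."""
    order = len(a) - 1
    return [sum(a[j] * x[n - j] for j in range(order + 1) if n - j >= 0)
            for n in range(len(x))]

# Usage on a synthetic first-order autoregressive signal.
random.seed(0)
signal = [0.0]
for _ in range(4999):
    signal.append(0.9 * signal[-1] + random.gauss(0.0, 1.0))
coeffs = lpc_coefficients(signal, order=2)
residual = lp_residual(signal, coeffs)
```

For this synthetic input, the second coefficient should come out close to -0.9 and the residual should carry less energy than the input, since the filter removes the predictable part of the signal.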

Blind estimation of reverberation time in occupied rooms

2006

A new framework is proposed in this paper to solve the reverberation time (RT) estimation problem in occupied rooms. In this framework, blind source separation (BSS) is combined with an adaptive noise canceller (ANC) to remove the noise from the passively received reverberant speech signal. A polyfit preprocessing step is then used to extract the free decay segments of the speech signal. RT is extracted from these segments with a maximum-likelihood (ML) based method. An easy, fast and consistent method to calculate the RT via the ML estimation method is also described. This framework provides a novel method for blind RT estimation with robustness to ambient noises within an occupied room and extends the ML method for RT estimation from noise-free cases to more realistic situations. Simulation results show that the proposed framework can provide a good estimation of RT in simulated low RT occupied rooms.

Noise-robust blind reverberation time estimation using noise-aware time–frequency masking

Measurement, 2022

The reverberation time is one of the most important parameters used to characterize the acoustic property of an enclosure. In real-world scenarios, it is much more convenient to estimate the reverberation time blindly from recorded speech compared to the traditional acoustic measurement techniques using professional measurement instruments. However, the recorded speech is often corrupted by noise, which has a detrimental effect on the estimation accuracy of the reverberation time. To address this issue, this paper proposes a two-stage blind reverberation time estimation method based on noise-aware time-frequency masking. This proposed method has a good ability to distinguish the reverberation tails from the noise, thus improving the estimation accuracy of reverberation time in noisy scenarios. The simulated and real-world acoustic experimental results show that the proposed method significantly outperforms other methods in challenging scenarios.