Improved TDOA disambiguation techniques for sound source localization in reverberant environments
Related papers
Advanced Disambiguation of TDOA Estimates in The Presence of Reverberation
2010
Time Difference of Arrival (TDOA) estimation between the signals acquired by pairs of microphones placed in a room with a single sound source, performed by searching for maxima of the Generalized Cross Correlation (GCC) function, is negatively affected by reverberation. Sound reflections in a room generate a great number of indirect source-to-microphone paths, each of which translates into an unpredictable number of spurious peaks in the GCC function. This paper illustrates results obtained by improving the performance of disambiguation algorithms with respect to mean processing time and mean estimation error.
Multi-source localization in reverberant environments
The very large relative bandwidth of acoustic sources, coupled with the high number of reflections in a typical listening room, makes localization a challenging task, since all basic assumptions of classical array processing algorithms are at best viable approximations in real-world environments. In this work, a novel decentralized approach for acoustic localization in reverberant environments is presented. It is based on a two-stage strategy. First, candidate source positions are found by a Time-Delay-Of-Arrival (TDOA) analysis of signals received by colocated pairs of microphones. Differential delays are estimated by a robust ROOT-MUSIC based technique, applied to the sample cross-spectrum of whitened signals recorded from each microphone pair. A subsequent clustering stage in the spatial coordinates validates the raw TDOA estimates, eliminating most false detections. The new algorithm is capable of tracking multiple speakers at the same time, exhibits a very good co...
Performance Improvement of TDOA-Based Speaker Localization in Joint Noisy and Reverberant Conditions
EURASIP Journal on Advances in Signal Processing, 2011
Time difference of arrival (TDOA)-based algorithms are common methods for speech source localization. The generalized cross correlation (GCC) method is the most important approach for estimating the TDOA between microphone pairs. The performance of this method degrades significantly in the presence of noise and reverberation. This paper addresses the problem of 3D localization in joint noisy and reverberant conditions and a single-speaker scenario. We first propose a modification to make the GCC-PHAse Transform (GCC-PHAT) method robust against environmental noise. Then, we use an iterative technique that employs location estimation to improve TDOA accuracy. Extensive experiments on both simulated and real data (in a single-source scenario) show the capability of the proposed methods to significantly improve TDOA accuracy and, consequently, source location estimates.
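For readers unfamiliar with the baseline, the following is a minimal sketch of standard GCC-PHAT delay estimation between two microphone signals. It is not the noise-robust modification proposed in the paper above. The function name `gcc_phat`, its parameters, and the small regularization constant are illustrative assumptions.

```python
import numpy as np

def gcc_phat(x, y, fs, max_tau=None):
    """Estimate the delay of y relative to x, in seconds, with plain GCC-PHAT.

    Illustrative sketch only: names and the 1e-12 regularizer are assumptions.
    """
    n = len(x) + len(y)                 # zero-pad so the correlation is linear
    X = np.fft.rfft(x, n=n)
    Y = np.fft.rfft(y, n=n)
    cross = Y * np.conj(X)              # cross-spectrum
    cross /= np.abs(cross) + 1e-12      # PHAT weighting: keep phase only
    cc = np.fft.irfft(cross, n=n)       # generalized cross correlation

    max_shift = n // 2
    if max_tau is not None:
        # Restrict the search to physically possible lags (at least 1 sample)
        max_shift = max(1, min(int(fs * max_tau), max_shift))

    # Reorder so the lags run from -max_shift to +max_shift
    cc = np.concatenate((cc[-max_shift:], cc[:max_shift + 1]))
    lag = int(np.argmax(cc)) - max_shift
    return lag / fs
```

A positive result means `y` lags `x`. In a reverberant room the global maximum picked by `np.argmax` may belong to a reflection rather than the direct path, which is the failure mode these papers address.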
Proc. Eur. Signal Processing Conf.(EUSIPCO), Florence, Italy, 2006
The problem of blind separation of multiple acoustic sources has recently been addressed by the TRINICON framework. By exploiting higher order statistics, it allows acoustic sources to be separated successfully when propagation takes place in a reverberant environment. In this paper we apply TRINICON to the problem of source localization, emphasizing the fact that it is possible to achieve small localization errors even when source separation is imperfect. Extensive simulations have been carried out in order ...
Source Localization in Reverberant Environments by Consistent Peak Selection
2007 IEEE International Conference on Acoustics, Speech and Signal Processing - ICASSP '07, 2007
Acoustic source localization in the presence of reverberation is a difficult task. Conventional approaches, based on time delay estimation performed by generalized cross correlation (GCC) on a set of microphone pairs, followed by geometric triangulation, are often unsatisfactory. Prefiltering is usually adopted to reduce the spurious peaks due to reflections.
Approaches for time difference of arrival estimation in a noisy and reverberant environment
2003
Determining the spatial position of a speaker is of growing interest in video conference scenarios where automated camera steering and tracking are required. As a preliminary step for the localization, a microphone array can be used to extract the time difference of arrival (TDOA) of the speech signal. The direction of arrival of the speech signal is then determined from the relative time delay between each pair of spatially separated microphones. In this work we present novel frequency-domain approaches for TDOA calculation in a reverberant and noisy environment. Our methods are based on the quasi-stationarity of speech and on the fact that the speech and the noise are uncorrelated. The proposed methods are supported by an extensive experimental study.
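The step from a relative delay to a direction of arrival can be sketched for a single microphone pair under a far-field (plane-wave) assumption, where the delay tau, the microphone spacing d, and the speed of sound c give sin(theta) = c * tau / d. The abstract does not state this formula; the function name, the default c = 343 m/s, and the broadside angle convention are assumptions made for illustration.

```python
import math

def doa_from_tdoa(tau, mic_distance, c=343.0):
    """Far-field direction of arrival, in degrees from broadside, for one pair.

    Illustrative sketch: assumes a plane wave and c = 343 m/s by default.
    """
    s = c * tau / mic_distance
    # Noise can push |s| slightly above 1; clamp so asin stays defined
    s = max(-1.0, min(1.0, s))
    return math.degrees(math.asin(s))
```

A single pair only resolves the angle up to a front-back ambiguity, which is one reason several spatially separated pairs are combined in practice.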
Time difference of arrival estimation of speech source in a noisy and reverberant environment
Signal Processing, 2005
Determining the spatial position of a speaker is of growing interest in video conference scenarios where automated camera steering and tracking are required. Speaker localization can be achieved with a dual-step approach. In the preliminary stage a microphone array is used to extract the time difference of arrival (TDOA) of the speech signal. These readings are then used by the second stage for the actual localization. In this work we present novel frequency-domain approaches for TDOA calculation in a reverberant and noisy environment. Our methods are based on the quasi-stationarity of speech, the stationarity of noise, and the fact that the speech and the noise are uncorrelated. The mathematical derivations in this work are followed by an extensive experimental study which involves static and tracking scenarios.
Signal Processing: An Improved Method for TDOA-Based Speech Source Localization
TDOA (Time Difference Of Arrival)-based algorithms are common methods for speech source localization. The Generalized Cross Correlation (GCC) method is the most important approach for estimating the TDOA between microphone pairs. The performance of this method degrades significantly in the presence of noise and reverberation. In this paper, we first propose a modification to make the GCC-PHAT method robust against environmental noise. Then, we use an iterative technique that employs the location estimate to improve TDOA accuracy. Extensive experiments show that the proposed methods significantly increase TDOA accuracy and, consequently, yield more accurate estimates of the source location.
Journal of Signal Processing
Source localization techniques are important and effective for various applications. Without reverberation and noise, some conventional source localization techniques achieve high accuracy. However, reverberation degrades their performance because of reflected sounds: these techniques compare observed time differences of arrival (TDOAs) with theoretical TDOAs, which are derived for anechoic conditions and do not agree with the actual ones under reverberant conditions. We propose a template-based method that compensates for the discrepancy between theoretical and actual TDOAs in order to reduce the influence of reflections. In this study, two types of experiments are performed to validate the effectiveness of the proposed method. The first is a simple case that investigates the estimation accuracy in detail, and the second investigates a more practical case in a more complicated situation. In both cases, the proposed method calibrates these errors effectively without increasing the computational time and improves the performance of conventional methods.
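The conventional comparison described above, matching observed TDOAs against theoretical anechoic TDOAs, can be sketched as a grid search over candidate positions. This illustrates only the baseline step, not the template-based compensation the paper proposes. The microphone layout, the candidate grid, the speed of sound, and all function names are hypothetical.

```python
import numpy as np

C = 343.0  # assumed speed of sound in m/s

def theoretical_tdoas(source, mics, pairs):
    """Anechoic TDOAs (seconds) for a source position and microphone pairs."""
    dists = np.linalg.norm(mics - source, axis=1)
    return np.array([(dists[j] - dists[i]) / C for i, j in pairs])

def localize_by_grid(observed, mics, pairs, candidates):
    """Return the candidate whose theoretical TDOAs best match the observed ones."""
    errors = [np.sum((theoretical_tdoas(p, mics, pairs) - observed) ** 2)
              for p in candidates]
    return candidates[int(np.argmin(errors))]

# Hypothetical 2D setup: four microphones and a 0.5 m candidate grid
mics = np.array([[0.0, 0.0], [4.0, 0.0], [0.0, 3.0], [3.5, 3.2]])
pairs = [(0, 1), (0, 2), (0, 3)]
xs = np.arange(0.0, 4.01, 0.5)
candidates = np.array([[x, y] for x in xs for y in xs])

# Simulated noise-free observation for a source placed on the grid
observed = theoretical_tdoas(np.array([1.5, 2.5]), mics, pairs)
est = localize_by_grid(observed, mics, pairs, candidates)
```

With noise-free anechoic TDOAs the search recovers the simulated position. Under reverberation the observed TDOAs deviate from the anechoic model, so the minimum of this error can shift away from the true source.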
EURASIP Journal on Advances in Signal Processing, 2007
Speaker localization with microphone arrays has received significant attention in the past decade as a means for automated tracking of individuals in a closed space for videoconferencing systems, directed speech capture systems, and surveillance systems. Traditional techniques are based on estimating the relative time differences of arrival (TDOAs) between different channels using the cross-correlation function. As we show in the context of speaker localization, these estimates yield poor results due to the joint effect of reverberation and the directivity of sound sources. In this paper, we present a novel method that utilizes a priori acoustic information about the monitored region, which makes it possible to localize directional sound sources by taking the effect of reverberation into account. The proposed method shows a significant improvement in performance compared with traditional methods in "noise-free" conditions. Further work is required to extend its capabilities to noisy environments.