Axel Plinge | TU Dortmund

Papers by Axel Plinge

Bag-of-Features Acoustic Event Detection for Sensor Networks

Multi-microphone Speech Enhancement Informed by Auditory Scene Analysis

A multitude of multi-microphone speech enhancement methods is available. In this paper, we focus our attention on the well-known minimum variance distortionless response (MVDR) beamformer, due to its ability to preserve a distortionless response towards the desired speaker while minimizing the output noise power. We explore two alternatives for constructing the steering vectors towards the desired speech source. One uses only the direct path of the speech propagation in the form of delay-only filters, while the other uses the entire room impulse response (RIR). All beamforming methods require some control information to accomplish the task of enhancing a desired speech signal. In this paper, an acoustic event detection method using biologically inspired features is employed. It can interpret the auditory scene by detecting the presence of different auditory objects. This is employed to control the estimation procedures used by the beamformer. The resulting system provides a blind method of speech enhancement that can improve intelligibility independently of any additional information. Experiments with real recordings show the practical applicability of the method. A significant gain in fwSNRseg (frequency-weighted segmental SNR) is achieved. Compared to using the direct path only, the use of the entire RIR proves beneficial. Index Terms: microphone array, auditory scene analysis, blind beamformer for speech enhancement
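For orientation, the standard textbook form of the MVDR weights and the two steering-vector choices named in the abstract can be written as below. The notation is assumed here for illustration and is not taken from the paper: w is the weight vector, d the steering vector, Φ_nn the noise covariance matrix, τ_m the direct-path delay to microphone m, and H_m the transfer function obtained from the RIR at microphone m. The paper may normalize the steering vector differently, for example relative to a reference microphone.

```latex
\[
\mathbf{w}(\omega)
  = \frac{\boldsymbol{\Phi}_{nn}^{-1}(\omega)\,\mathbf{d}(\omega)}
         {\mathbf{d}^{H}(\omega)\,\boldsymbol{\Phi}_{nn}^{-1}(\omega)\,\mathbf{d}(\omega)},
\qquad
\mathbf{w}^{H}(\omega)\,\mathbf{d}(\omega) = 1
\]
% Delay-only steering vector (direct path):  d_m(\omega) = e^{-j\omega\tau_m}
% RIR-based steering vector (full response): d_m(\omega) = H_m(\omega)
```

The constraint on the right is the distortionless condition: the desired source passes with unit gain while the output noise power is minimized.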

Video: Acoustic Event Detection using Bag-of-Features

Poster: A Bag-of-Features Approach to Acoustic Event Detection

Demo Video: Reverberation-Robust Online Multi-Speaker Tracking by using a Microphone Array and CASA Processing

Poster: Robust Neuro-Fuzzy Speaker Localization Using a Circular Microphone Array

Video: Geometry Calibration of Distributed Microphone Arrays Exploiting Audio-Visual Correspondences

Poster: Neurobiologically Inspired ManySpeaker Tracking

Tracking speakers is one of the key tasks in smart environments. A neurobiologically inspired real-time system using multiple distributed nodes with small circular microphone arrays is designed to accomplish this task. Each node localizes speakers with a dedicated cochlear and midbrain model. Sparse angular localizations and their spectra are transmitted to an integration node, where they are associated using their spectra to resolve the ambiguity of multiple simultaneous detections for multiple concurrent speakers. The speakers' Euclidean coordinates are computed by triangulation, and tracking is realized by integrating over time using spatial association. The system is designed to be robust against drift, jitter and transmission errors, so that it can be easily realized with wireless connections. Practical applicability is proven with recordings of persons in a laboratory setup in a highly reverberant conference room, where concurrent speakers are tracked with good accuracy.
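The triangulation step mentioned in the abstract can be illustrated with a minimal 2-D sketch. It assumes two nodes at known positions, each reporting one bearing angle in radians measured from the x-axis. The function name `triangulate` and its parameters are hypothetical and chosen for illustration; the actual system additionally performs spectral association and tracking over time, which are not shown here.

```python
import math


def triangulate(p1, theta1, p2, theta2):
    """Intersect two bearing lines in the plane.

    p1, p2: (x, y) positions of the two nodes (assumed known).
    theta1, theta2: bearing angles in radians, measured from the x-axis.
    Returns the (x, y) intersection, or None if the bearings are parallel.
    """
    # Unit direction vectors of the two bearings.
    d1 = (math.cos(theta1), math.sin(theta1))
    d2 = (math.cos(theta2), math.sin(theta2))

    # Solve p1 + t1*d1 = p2 + t2*d2 for t1 using 2-D cross products.
    denom = d1[0] * d2[1] - d1[1] * d2[0]
    if abs(denom) < 1e-9:
        return None  # (nearly) parallel bearings have no unique intersection

    dx, dy = p2[0] - p1[0], p2[1] - p1[1]
    t1 = (dx * d2[1] - dy * d2[0]) / denom
    return (p1[0] + t1 * d1[0], p1[1] + t1 * d1[1])


if __name__ == "__main__":
    # Two nodes 2 m apart, both pointing towards a source between them.
    print(triangulate((0.0, 0.0), math.radians(45), (2.0, 0.0), math.radians(135)))
```

This sketch intersects infinite lines rather than rays, so a practical version would also check that the intersection lies in front of both nodes.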

Multiple speaker tracking using a microphone array by combining auditory processing and a Gaussian mixture cardinalized probability hypothesis density filter

2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2011

A Bag-of-Features approach to acoustic event detection

2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2014

Multi-speaker tracking using multiple distributed microphone arrays

2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2014

Geometry calibration of multiple microphone arrays in highly reverberant environments

2014 14th International Workshop on Acoustic Signal Enhancement (IWAENC), 2014

Neurobiologically Inspired ManySpeaker Tracking

Tracking speakers is one of the key tasks in smart environments. A neurobiologically inspired real-time system using multiple distributed nodes with small circular microphone arrays is designed to accomplish this task. Each node localizes speakers with a dedicated cochlear and midbrain model. Sparse angular localizations and their spectra are transmitted to an integration node, where they are associated using their spectra to resolve the ambiguity of multiple simultaneous detections for multiple concurrent speakers. The speakers' Euclidean coordinates are computed by triangulation, and tracking is realized by integrating over time using spatial association. The system is designed to be robust against drift, jitter and transmission errors, so that it can be easily realized with wireless connections. Practical applicability is proven with recordings of persons in a laboratory setup in a highly reverberant conference room, where concurrent speakers are tracked with good accuracy.

Geometry Calibration of Distributed Microphone Arrays Exploiting Audio-Visual Correspondences

Online Multi-Speaker Tracking Using Multiple Microphone Arrays Informed by Auditory Scene Analysis

Spatial orienting of attention in stereo depth

Psychological Research, 2012

The aim of this study was to investigate the spatial orienting of visual attention in depth under purely stereoscopic viewing conditions. Random-dot stereograms were used to present disparity-defined target stimuli that were either validly or invalidly cued in depth. In separate tasks, participants responded either to the relative depth of the target (protruding vs. receding) or to its shape (square vs. diamond). Stimulus onset asynchronies (SOAs) between an uninformative exogenous cue and target were varied from 250 to 600 ms. For both tasks, mean response times (RTs) were shorter for validly than invalidly cued target depths and this RT advantage was essentially restricted to the shortest SOA of 250 ms. These results indicate that attention can be reflexively allocated to locations in stereo depth under conditions of low perceptual load, and independent of whether depth is relevant to the task or not.

Speech Enhancement and Replacement for Severely Hearing Impaired Persons by Digital Processing; Current Status

Abstract. For several years, we have been working on means to improve speech reception for severely sensory hearing-impaired persons. The work done includes algorithms for non-linear speech processing as well as phoneme spotting and transposition. The overall goal is to implement some of these algorithms into low-power DSPs in a wearable device. Here we will present the current state of our research, and some examples of processed speech will be demonstrated. Keywords: hearing impairment, digital processing, transposition, ...

Telecommunication Adapter For Hearing Aid Wearers-with extensions

Assistive technology: added value to the quality of life, AAATE'01, Oct 1, 2001

Abstract. Modern communication technology, if not adapted to the needs of the severely hearing impaired person, leads to exclusion from everyday communication. Only if it is well adapted can it offer a higher degree of freedom and integration. A telecommunication adapter was developed and can be used for two purposes: to provide access to mobile phone technology and, with an extension, to provide high-quality access to PSTN phones. At the same time, connections to TV, radio and other external sources are made possible.

Providing Speech Enhancement and Replacement for Persons with Severely Impaired Hearing

Tools and Strategies for Fitting a Wearable Frication Transposer to the Needs of Severely Hearing Impaired People
