Jon Boley - Academia.edu
Papers by Jon Boley
The Neurophysiological Bases of Auditory Perception, 2010
Michael G. Heinz, Jayaganesh Swaminathan, Jonathan D. Boley, and Sushrut Kale

... may determine the optimal time constants. Acknowledgements: Thanks to Michael Stone and Brian Moore for supplying their hearing aid algorithm and to Jayaganesh Swaminathan for the SAC/XpAC analysis code.

[Figure: amplitude vs. time (sec)]
A system and method for detecting a non-visual code using an application on a mobile device, where the application is capable of associating the non-visual code with at least one item contained in a transmitted presentation and connecting the mobile device to information about the item in a database associated with the transmitted presentation. The non-visual code may comprise a high frequency signal played alone or with another audio or video signal. A mobile device application executing on a processor of the mobile device performs signal processing on the audio signal of the presentation to extract the high frequency signal. Also contemplated is obtaining information about the visual content and presenting the information on the personal device.
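As an illustration of the kind of extraction this describes, the sketch below uses the Goertzel algorithm to measure energy at a single high frequency in framed audio. The carrier frequency (19 kHz), frame length, and threshold are illustrative assumptions, not values from the patent.

```python
import numpy as np

def goertzel_power(frame, fs, f_target):
    """Power at a single frequency bin via the Goertzel algorithm."""
    n = len(frame)
    k = int(round(n * f_target / fs))        # nearest DFT bin
    coeff = 2.0 * np.cos(2.0 * np.pi * k / n)
    s_prev, s_prev2 = 0.0, 0.0
    for x in frame:
        s = x + coeff * s_prev - s_prev2
        s_prev2, s_prev = s_prev, s
    # squared magnitude of the selected DFT bin
    return s_prev2**2 + s_prev**2 - coeff * s_prev * s_prev2

def detect_tone(audio, fs=48000, f_carrier=19000, frame_len=1024, threshold=1e-3):
    """Flag frames whose energy at f_carrier exceeds a (hypothetical) threshold."""
    hits = []
    for start in range(0, len(audio) - frame_len, frame_len):
        frame = audio[start:start + frame_len]
        p = goertzel_power(frame, fs, f_carrier) / frame_len**2
        hits.append(p > threshold)
    return hits
```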
Hearing aids have been tremendously successful at restoring some hearing abilities for people with hearing impairments. However, even with some of today's most advanced technologies, patients still have an abnormal degree of difficulty understanding speech in the presence of
competing talkers. Hearing scientists believe that listening in complex acoustic environments such as this may require the use of details in the acoustic waveform that might otherwise be
unimportant. Some research suggests that the adaptation rate at which hearing aids actively amplify sounds may determine the amount of acoustic information that gets passed to the
auditory nerve. Research also suggests that the best adaptation rate for a given patient may depend on the underlying physiology of that particular patient. This paper evaluates the importance of this fine structure information in terms of the underlying physiology, hearing aid amplification strategies, and the resulting neural responses. By combining ideas from the behavioral and neurophysiological studies presented here, I propose a study to identify how the amplification speed may be chosen to improve the encoding of fine structure information for patients based on their individual physiological impairments.
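The "adaptation rate" at issue corresponds to the attack and release time constants of a hearing aid's level-dependent gain. As a minimal sketch (the threshold, ratio, and time constants below are illustrative assumptions, not values from the studies discussed), a wide dynamic-range compressor with adjustable time constants might look like this:

```python
import numpy as np

def compress(x, fs, threshold_db=-40.0, ratio=3.0, attack_ms=5.0, release_ms=50.0):
    """Wide dynamic-range compression with adjustable time constants.

    Faster attack/release tracks the waveform more closely (more gain
    modulation); slower constants approach linear gain. x is a float array.
    """
    a_att = np.exp(-1.0 / (fs * attack_ms / 1000.0))
    a_rel = np.exp(-1.0 / (fs * release_ms / 1000.0))
    env = 0.0
    y = np.empty_like(x)
    for i, s in enumerate(x):
        level = abs(s)
        a = a_att if level > env else a_rel       # attack vs. release branch
        env = a * env + (1.0 - a) * level         # one-pole envelope follower
        level_db = 20.0 * np.log10(max(env, 1e-9))
        over = max(level_db - threshold_db, 0.0)
        gain_db = -over * (1.0 - 1.0 / ratio)     # static compression curve
        y[i] = s * 10.0 ** (gain_db / 20.0)
    return y
```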
The cocktail party effect, our ability to separate a sound source from a multitude of other sources, has been researched in detail over the past few decades, and many investigators have tried to model this on computers. Two of the major research areas currently being evaluated for the so-called sound source separation problem are Auditory Scene Analysis and a class of statistical analysis techniques known as Independent Component Analysis. This paper presents a methodology for combining these two techniques. It suggests a framework that first separates sounds by analyzing the incoming audio for patterns and synthesizing or filtering them accordingly, measures features of the resulting tracks, and finally separates sounds statistically by matching feature sets and making the output streams statistically independent. Artificial and acoustical mixes of sounds are used to evaluate the signal-to-noise ratio where the signal is the desired source and the noise is comprised of all other sources. The proposed system is found to successfully separate audio streams. The amount of separation is inversely proportional to the amount of reverberation present.
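A minimal sketch of the evaluation step described above, computing SNR with the desired source as signal and everything else as noise. The projection onto the target (to allow arbitrary scaling of the estimate) is an assumption borrowed from common source-separation evaluation practice; the paper's exact computation may differ.

```python
import numpy as np

def separation_snr(estimate, target):
    """SNR (dB) of a separated stream against the known target source.

    Everything in the estimate that is not the (scaled) target is treated
    as noise: the residual of all other sources plus artifacts.
    """
    # project the estimate onto the target so scaling does not affect the score
    alpha = np.dot(estimate, target) / np.dot(target, target)
    signal = alpha * target
    noise = estimate - signal
    return 10.0 * np.log10(np.sum(signal**2) / np.sum(noise**2))
```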
Hearing aids are able to restore some hearing abilities for people with auditory impairments, but background noise remains a significant problem. Unfortunately, we know very little about how speech is encoded in the auditory system, particularly in impaired systems with prosthetic amplifiers. There is growing evidence that relative timing in the neural signals (known as spatiotemporal coding) is important for speech perception, but there is little research that relates spatiotemporal coding and hearing aid amplification.
This research used a combination of computational modeling and neurophysiological experiments to characterize how hearing aids affect vowel coding in noise at the level of the auditory nerve. The results indicate that sensorineural hearing impairment degrades the temporal cues transmitted from the ear to the brain. Two hearing aid strategies (linear gain and wide dynamic-range compression) were used to amplify the acoustic signal. Although appropriate gain was shown to improve temporal coding for individual auditory nerve fibers, neither strategy improved spatiotemporal cues.
Previous work has attempted to correct the relative timing by adding frequency-dependent delays to the acoustic signal (e.g., within a hearing aid). We show that, although this strategy can affect the timing of individual auditory nerve responses, there is a fundamental limitation in the ability of this approach to improve the relative across-fiber timing (spatiotemporal coding) as intended.
We have shown that existing hearing aid technologies do not improve some of the neural cues that we think are important for perception, but it is important to understand these limitations. Our hope is that this knowledge can be used to develop new technologies to improve auditory perception in difficult acoustic environments.
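For the frequency-dependent delay manipulation discussed above, a minimal sketch is given below, assuming a frequency-domain implementation; the delay curve delay_fn is a placeholder, and the filter designs used in the published work are not reproduced here.

```python
import numpy as np

def frequency_dependent_delay(x, fs, delay_fn):
    """Delay each frequency component of x by delay_fn(f) seconds.

    Implemented as a phase adjustment in the frequency domain: a delay of
    tau seconds multiplies the bin at frequency f by exp(-j*2*pi*f*tau).
    Assumes x is zero-padded enough that circular wraparound is negligible.
    """
    n = len(x)
    X = np.fft.rfft(x)
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)          # bin frequencies in Hz
    taus = np.array([delay_fn(f) for f in freqs])   # per-bin delay in seconds
    X *= np.exp(-2j * np.pi * freqs * taus)
    return np.fft.irfft(X, n)

# Example (illustrative curve): delay low frequencies more than high ones
# y = frequency_dependent_delay(x, 44100, lambda f: 0.002 / (1.0 + f / 1000.0))
```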
There is a consensus among many in the audio industry that recorded music has grown increasingly
compressed over the past few decades. Some industry professionals are concerned that this
compression often results in poor audio quality with little dynamic range. Although some algorithms have
been proposed for calculating dynamic range, we have not been able to find any studies suggesting that
any of these metrics accurately represent any perceptual dimension of the measured sound. In this
paper, we review the various proposed algorithms and compare their results with the results of a listening
test. We show that none of the tested metrics accurately predict the perceived dynamic range of a
musical track, but we identify some potential directions for future work.
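As an example of the simplest class of metric under discussion, the crest factor (peak-to-RMS ratio) can be computed as below. This is offered only as an illustration of a candidate dynamic range measure, not necessarily one of the specific algorithms tested in the paper; the 3-second window is an arbitrary choice.

```python
import numpy as np

def crest_factor_db(x):
    """Peak-to-RMS ratio in dB, the simplest candidate dynamic range metric."""
    peak = np.max(np.abs(x))
    rms = np.sqrt(np.mean(x**2))
    return 20.0 * np.log10(peak / rms)

def windowed_crest_db(x, fs, win_s=3.0):
    """Crest factor per non-overlapping window, so a track's dynamics can be
    summarized by, e.g., a percentile rather than a single global peak."""
    n = int(fs * win_s)
    return np.array([crest_factor_db(x[i:i + n])
                     for i in range(0, len(x) - n + 1, n)])
```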
As the audio, video and related industries work toward establishing standards for subjective measures of audio/video
quality, more information is needed to understand subjective audio/video interactions. This paper reports a
contribution to this effort that aims to extend previous studies, which show that audio and video quality influence
each other and that some audio artifacts affect overall quality more than others. In the current study, these findings
are combined in a new experiment designed to reveal how individual impairments of audio affect perceived video
quality. Our results show that some audio artifacts enhance the ability to identify video artifacts, while others make
discrimination more difficult.
ABX tests have been around for decades and provide a simple, intuitive means to determine if there is
an audible difference between two audio signals. Unfortunately, however, the results of proper statistical
analyses are rarely published along with the results of the ABX test. The interpretation of the results may
critically depend on a proper statistical analysis. In this paper, a very successful analysis method known
as signal detection theory is presented in a way that is easy to apply to ABX tests. This method is
contrasted with other statistical techniques to demonstrate the benefits of this approach.
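A minimal sketch of such an analysis: a binomial test for whether the score beats guessing, plus a detection-theoretic sensitivity estimate. The d' formula below uses the simple sqrt(2)*z(Pc) approximation rather than the full ABX differencing model the paper may develop; treat it as illustrative.

```python
import numpy as np
from scipy.stats import binom, norm

def analyze_abx(n_correct, n_trials):
    """Significance and sensitivity for an ABX test result.

    p_value: probability of scoring at least this well by guessing
    (one-sided binomial test against chance, p = 0.5).
    d_prime: sensitivity from proportion correct; the 2AFC-style
    approximation used here understates d' relative to the full ABX
    differencing model.
    """
    p_value = binom.sf(n_correct - 1, n_trials, 0.5)   # P(X >= n_correct)
    pc = n_correct / n_trials
    d_prime = np.sqrt(2.0) * norm.ppf(pc) if 0.5 < pc < 1.0 else float("nan")
    return p_value, d_prime

# Example: 14 correct out of 20 trials
# p, d = analyze_abx(14, 20)   # p ~ 0.058, d' ~ 0.74
```

Note how the example illustrates the point of the paper: 70% correct (14/20) feels convincing, yet it fails a 5% significance criterion.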
A subjective listening test was conducted to determine how objectionable various amounts of latency are for
performers in live monitoring scenarios. Several popular instruments were used and the results of tests with wedge
monitors are compared to those with in-ear monitors. It is shown that the audibility of latency is dependent on both
the type of instrument and monitoring environment. This experiment shows that the acceptable amount of latency
can range from 42 ms to possibly less than 1.4 ms under certain conditions. The differences in latency perception for
each instrument are discussed. It is also shown that more latency is generally acceptable for wedge monitoring
setups than for in-ear monitors.
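Test stimuli for such an experiment can be generated by simply delaying the monitor feed; a minimal sketch follows (the paper does not describe its implementation, and rounding to whole samples is an assumption here).

```python
import numpy as np

def apply_latency(x, fs, latency_ms):
    """Simulate monitoring latency by delaying the signal by latency_ms.

    Rounds to the nearest whole sample; e.g., 1.4 ms at fs = 48 kHz
    is about 67 samples.
    """
    d = int(round(fs * latency_ms / 1000.0))
    return np.concatenate([np.zeros(d), x])[:len(x)]
```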
Two of the principal research areas currently being evaluated for the so-called sound source separation problem are
Auditory Scene Analysis and a class of statistical analysis techniques known as Independent Component Analysis.
This paper presents a methodology for combining these two techniques. It suggests a framework that first separates
sounds by analyzing the incoming audio for patterns and synthesizing or filtering them accordingly. It then measures
features of the resulting tracks and separates the sounds statistically by matching feature sets and attempting to make
the output streams statistically independent. The proposed system is found to successfully separate artificial and
acoustic mixes of sounds. As expected, the amount of separation is inversely proportional to the amount of
reverberation present, number of sources, and interchannel correlation.
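To illustrate the ICA stage only (the ASA front end and feature matching are not shown, and the toy signals and mixing matrix below are fabricated for the demo), scikit-learn's FastICA can unmix a two-channel instantaneous mixture:

```python
import numpy as np
from sklearn.decomposition import FastICA

rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 8000)                 # 1 s at fs = 8 kHz
s1 = np.sin(2 * np.pi * 440 * t)                # tone
s2 = np.sign(np.sin(2 * np.pi * 3 * t))         # square wave
S = np.c_[s1, s2]                               # (n_samples, n_sources)
A = np.array([[1.0, 0.5], [0.4, 1.0]])          # mixing matrix
X = S @ A.T + 0.02 * rng.standard_normal((len(t), 2))

ica = FastICA(n_components=2, random_state=0)
estimates = ica.fit_transform(X)                # sources, up to scale/order
```

Note that plain FastICA assumes an instantaneous mixture, which is consistent with the abstract's observation that separation degrades as reverberation increases.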
The growth of the commercial spaceflight industry in recent years led to the April 2010 birth of Astronauts4Hire (A4H), a nonprofit organization aimed at developing the market and supply of commercial astronaut candidates. To date, A4H comprises 22 carefully selected Flight Members (the "Astronauts4Hire") with a total of 30 advanced degrees, 13 SCUBA divers, 11 pilots, and 15 members with Zero- or High-G training. A4H's first for-hire job was completed in February 2011 in a Zero-G environment, and its first official training session was completed in July 2011, which included sea survival, emergency egress, centrifuge, altitude chamber, and spatial disorientation training. The first class of A4H Research Specialist astronauts will be fully trained for parabolic and suborbital flight by early 2012.