Perceptual segregation of tones embedded in modulated noise maskers

Effects of time intervals and tone durations on auditory stream segregation

Perception & Psychophysics, 2000

Adult listeners rated the difficulty of hearing a single coherent stream in a sequence of high (H) and low (L) tones that alternated in a repetitive galloping pattern (HLH-HLH-HLH ...). They could hear the gallop when the sequence was perceived as a single stream, but when it segregated into two substreams, they heard H-H-... in one stream and L-L-... in the other. The onset-to-onset time of the tones, their duration, the interstimulus interval (ISI) between tones of the same frequency, and the frequency separation between H and L tones were varied. Subjects' ratings on a 7-point scale showed that the well-known effect of speed's increasing stream segregation is primarily due to its effect on the ISI between tones in the same frequency region. This has implications for several theories of streaming.

When a sequence of tones, alternating between two frequency ranges, is speeded up, the tendency for the high and low tones to form separate auditory streams is increased. It has been proposed by Bregman (1990) that tones group by their proximity on a frequency-by-time surface. An increase of speed brings the tones closer together in time but does not reduce their frequency separations. This brings the consecutive tones of the same frequency closer together on the frequency-by-time surface, while leaving those of different frequencies almost as far away as they were before. This new proximity favors the grouping of a tone with the next one in the same frequency range even if the two tones are not consecutive, because the alternative grouping (with the tone that comes right after it but is of a different frequency) requires grouping across a longer distance. So we see that temporal distance is very important. But what is the best way to measure temporal distance? The effect of speed could be due to a change in any of the four types of time intervals shown in Figure 1, which all become shorter when the speed is increased.
(Note: SOA means stimulus onset asynchrony, i.e., onset-to-onset time, and ISI is the label for interstimulus interval, i.e., offset-to-onset time.) (1) SOA for consecutive tones in the same frequency range (SOA-within). Note that in Figure 1 there are two different intervals of this type, one for each frequency, since the low tones occur less frequently than the high ones in the galloping pattern. (2) ISI for consecutive tones in the same frequency range (ISI-within). Again, there are two different intervals of this type, since the low tones occur less

Support from the Natural Sciences and Engineering Research Council of Canada (Experiment 1) and NIMH (Experiment 2) is gratefully acknowledged. We are also grateful for Lisa Weaver's assistance. Correspondence concerning this article should be addressed to A.
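The within-frequency intervals defined in this entry (onset-to-onset SOA and offset-to-onset ISI for the H tones and for the L tones) can be made concrete with a small sketch. It assumes an HLH- gallop built from four equal time slots per cycle (H, L, H, silence); the function names, the 100-ms slot spacing, and the 40-ms tone duration are illustrative assumptions, not values taken from the paper.

```python
def gallop_onsets(soa_ms, n_cycles):
    """Onset times (ms) of H and L tones in an HLH- gallop.

    Assumes four equal slots per cycle: H, L, H, silence.
    soa_ms is the onset-to-onset time between adjacent slots.
    """
    high, low = [], []
    for cycle in range(n_cycles):
        start = 4 * soa_ms * cycle
        high.append(start)               # first H of the triplet
        low.append(start + soa_ms)       # the L in the middle
        high.append(start + 2 * soa_ms)  # second H of the triplet
    return high, low


def within_intervals(onsets, dur_ms):
    """SOA-within and ISI-within for tones of one frequency.

    SOA is onset-to-onset; ISI is offset-to-onset, i.e. SOA minus duration.
    """
    soa = [b - a for a, b in zip(onsets, onsets[1:])]
    isi = [s - dur_ms for s in soa]
    return soa, isi


# Illustrative values only (not from the paper).
high, low = gallop_onsets(soa_ms=100, n_cycles=3)
h_soa, h_isi = within_intervals(high, dur_ms=40)
l_soa, l_isi = within_intervals(low, dur_ms=40)
print("H tones: SOA-within", h_soa[0], "ms, ISI-within", h_isi[0], "ms")
print("L tones: SOA-within", l_soa[0], "ms, ISI-within", l_isi[0], "ms")
```

Under these assumptions the L tones recur half as often as the H tones, so their within-frequency SOA and ISI are both longer. Lengthening the tones at a fixed rate shortens the ISI while leaving the SOA unchanged, which is how the two measures can be varied separately.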

Perceptual organization of complex auditory sequences: Effect of number of simultaneous subsequences and frequency separation

Journal of Experimental Psychology: Human Perception and Performance, 1999

Previous findings on streaming are generalized to sequences composed of more than 2 subsequences. A new paradigm identified whether listeners perceive complex sequences as a single unit (integrative listening) or segregate them into 2 (or more) perceptual units (stream segregation). Listeners heard 2 complex sequences, each composed of 1, 2, 3, or 4 subsequences. Their task was to detect a temporal irregularity within 1 subsequence. In Experiment 1, the smallest frequency separation under which listeners were able to focus on 1 subsequence was unaffected by the number of co-occurring subsequences; nonfocused sounds were not perceptually organized into streams. In Experiment 2, detection improved progressively, not abruptly, as the frequency separation between subsequences increased from 0.25 to 6 auditory filters. The authors propose a model of perceptual organization of complex auditory sequences.
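To give a feel for a frequency separation expressed in auditory filters, the sketch below converts between frequency in Hz and position on an auditory-filter scale. It assumes the Glasberg and Moore (1990) ERB-number formula as the filter scale; the abstract does not say which scale the authors used, and the 1000-Hz base frequency and the function names are purely illustrative.

```python
import math


def erb_number(freq_hz):
    """ERB-number (in filter units) for a frequency in Hz.

    Assumption: Glasberg & Moore (1990) formula, not stated in the abstract.
    """
    return 21.4 * math.log10(0.00437 * freq_hz + 1.0)


def freq_from_erb_number(erb):
    """Inverse of erb_number: frequency in Hz for a given ERB-number."""
    return (10.0 ** (erb / 21.4) - 1.0) / 0.00437


def shift_by_filters(freq_hz, n_filters):
    """Frequency lying n_filters auditory filters above freq_hz."""
    return freq_from_erb_number(erb_number(freq_hz) + n_filters)


# Hypothetical base frequency; 0.25 and 6 are the extremes named in the abstract.
base_hz = 1000.0
for separation in (0.25, 6.0):
    print(separation, "filters above", base_hz, "Hz:",
          round(shift_by_filters(base_hz, separation), 1), "Hz")
```

Expressing separation this way keeps the perceptual spacing roughly comparable across frequency regions, which a fixed separation in Hz would not.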

Rhythmic masking release: Contribution of cues for perceptual organization to the cross-spectral fusion of concurrent narrow-band noises

Journal of the Acoustical Society of America, 2002

The contribution of temporal asynchrony, spatial separation, and frequency separation to the cross-spectral fusion of temporally contiguous brief narrow-band noise bursts was studied using the Rhythmic Masking Release paradigm (RMR). RMR involves the discrimination of one of two possible rhythms, despite perceptual masking of the rhythm by an irregular sequence of sounds identical to the rhythmic bursts, interleaved among them. The release of the rhythm from masking can be induced by causing the fusion of the irregular interfering sounds with concurrent "flanking" sounds situated in different frequency regions. The accuracy and the rated clarity of the identified rhythm in a 2-AFC procedure were employed to estimate the degree of fusion of the interfering sounds with flanking sounds. The results suggest that while synchrony fully fuses short-duration noise bursts across frequency and across space (i.e., across ears and loudspeakers), an asynchrony of 20-40 ms produces no fusion. Intermediate asynchronies of 10-20 ms produce partial fusion, where the presence of other cues is critical for unambiguous grouping. Though frequency and spatial separation reduced fusion, neither of these manipulations was sufficient to abolish it. For the parameters varied in this study, stimulus onset asynchrony was the dominant cue determining fusion, but there were additive effects of the other cues. Temporal synchrony appears to be critical in determining whether brief sounds with abrupt onsets and offsets are heard as one event or more than one.

The relative role of beats and combination tones in determining the shapes of masking patterns: II. Hearing-impaired listeners

Hearing Research, 2002

Masking patterns were measured for hearing-impaired subjects with varying degrees of hearing loss. In one set of conditions, three subjects were tested using narrowband noise ('noise') and sinusoidal ('tone') maskers and narrowband noise signals. The maskers had centre frequencies of 0.25, 0.5, 1.0 and 4.0 kHz and levels of 60, 80 and 100 dB SPL. Masking patterns for both the noise and tone maskers showed irregularities ('dips'), especially for signal frequencies up to 500 Hz above the masker frequency. The irregularities occurred for all masker levels and for all subjects for at least one masker frequency and they occurred for a relatively constant range of masker-signal frequency separations, suggesting that they were the result of beat detection. In another set of conditions, masking patterns were measured using two subjects, for a 2.0-kHz tone masker with a level of 100 dB SPL and tone and noise signals. For the tone masker alone (baseline condition), the masking patterns again exhibited prominent dips above, and sometimes below, the masker frequency. The addition of a lowpass noise to the masker, intended to mask combination tones, had little effect for one subject. For the other subject, who had near-normal absolute thresholds at low frequencies, the noise elevated thresholds for masker-signal frequency separations between 500 and 1500 Hz. For this subject, an extra tone with a frequency equal to the masker-signal frequency separation, added in place of the lowpass noise, had a very similar effect to that produced by the lowpass noise, suggesting that he was detecting a simple difference tone in the baseline condition. The addition of a pair of high-frequency tones (MDI tones, intended to reduce the detectability of beats) to the masker elevated thresholds for signal frequencies from 1500 to 2500 Hz for one subject and from 1500 to 3500 Hz for another subject.
The addition of lowpass noise and MDI tones to the masker produced masking patterns very similar to those observed when the MDI tones alone were added to the masker. Overall, the results suggest that the irregularities in the masking patterns were caused mainly by the detection of beats and not by the detection of combination tones.

Rhythmic Masking Release: Effects of Asynchrony, Temporal Overlap, Harmonic Relations, and Source Separation on Cross-Spectral Grouping

Journal of Experimental Psychology: Human Perception and Performance, 2005

The rhythm created by spacing a series of brief tones in a regular pattern can be disguised by interleaving identical distractors at irregular intervals. The disguised rhythm can be unmasked if the distractors are allocated to a separate stream from the rhythm by integration with temporally overlapping captors. Listeners identified which of 2 rhythms was presented, and the accuracy and rated clarity of their judgment were used to estimate the fusion of the distractors and captors. The extent of fusion depended primarily on onset asynchrony and degree of temporal overlap. Harmonic relations had some influence, but only an extreme difference in spatial location was effective (dichotic presentation). Both preattentive and attentionally driven processes governed performance.

Rhythmic masking release: Effects of asynchrony, temporal overlap, spectral pattern, and source separation on cross-spectral grouping

The rhythm created by spacing a series of brief tones in a regular pattern can be disguised by interleaving identical distractors at irregular intervals. The disguised rhythm can be unmasked if the distractors are allocated to a separate stream from the rhythm by integration with temporally overlapping captors. Listeners identified which of 2 rhythms was presented, and the accuracy and rated clarity of their judgment were used to estimate the fusion of the distractors and captors. The extent of fusion depended primarily on onset asynchrony and degree of temporal overlap. Harmonic relations had some influence, but only an extreme difference in spatial location was effective (dichotic presentation). Both preattentive and attentionally driven processes governed performance.

The effect of blanking on the identification of temporal order in three-tone sequences

Perception & Psychophysics, 1975

Trained subjects were asked to identify the temporal order of three 20-msec tones (891, 1,000, and 1,118 Hz), which were immediately followed by a fourth tone. It was found that this added tone, irrelevant to the observer's task, decreased the identifiability of the preceding three-tone pattern, as compared with that of the same pattern in isolation. Such a blanking of the memory of the three-tone sequence was most effective when the frequency of the fourth tone was either identical to that of the first pattern tone or when it lay 1/6-1/3 octave above the highest pattern frequency. The blanking effect was strongest when the duration of the fourth tone was equal to that of the pattern components.
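As a quick arithmetic check on the range described above, the sketch below converts "1/6-1/3 octave above the highest pattern frequency" into Hz. The only input taken from the abstract is the 1,118-Hz tone; the helper name is hypothetical, and the result is a back-of-the-envelope figure rather than one reported by the authors.

```python
def octave_shift(freq_hz, octaves):
    """Frequency lying the given number of octaves above freq_hz."""
    return freq_hz * 2.0 ** octaves


highest_pattern_hz = 1118.0  # highest of the three pattern tones
lower_edge = octave_shift(highest_pattern_hz, 1 / 6)
upper_edge = octave_shift(highest_pattern_hz, 1 / 3)
print(f"{lower_edge:.0f} Hz to {upper_edge:.0f} Hz")  # prints "1255 Hz to 1409 Hz"
```

So the most effective higher-frequency fourth tones would fall roughly between 1255 and 1409 Hz, about one to two steps above the top of the pattern, since the pattern tones themselves are spaced roughly 1/6 octave apart.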

Effect of deviations from temporal expectations on tempo discrimination of isochronous tone sequences

Journal of Experimental Psychology: Human Perception and Performance, 1998

The effect of deviations from temporal expectations on tempo discrimination was studied in 3 experiments using isochronous auditory sequences. Temporal deviations consisted of advancing or delaying the onset of a comparison pattern relative to an "expected" onset, defined by an extension of the periodicity of a preceding standard pattern. An effect of onset condition was most apparent when responses to faster and slower comparison patterns were analyzed separately and onset conditions were mixed. Under these conditions, early onsets produced more "faster" judgments and lower thresholds for tempo increases, and late onsets produced more "slower" judgments and lower thresholds for tempo decreases. In another experiment, pattern tempo had a similar effect: Fast tempos led to lower thresholds for tempo increases and slow tempos led to lower thresholds for tempo decreases. Findings support oscillator-based approaches to time discrimination.

The perceptual segregation of simultaneous auditory signals: Pulse train segregation and vowel segregation

Attention, Perception, & Psychophysics, 1989

In the experiments reported here, we attempted to find out more about how the auditory system is able to separate two simultaneous harmonic sounds. Previous research (Halikia & Bregman, 1984a; Scheffers, 1983a) had indicated that a difference in fundamental frequency (F0) between two simultaneous vowel sounds improves their separate identification. In the present experiments, we looked at the effect of F0s that changed as a function of time. In Experiment 1, pairs of unfiltered or filtered pulse trains were used. Some were steady-state, and others had gliding F0s; different F0 separations were also used. The subjects had to indicate whether they had heard one or two sounds. The results showed that increased F0 differences and gliding F0s facilitated the perceptual separation of simultaneous sounds. In Experiments 2 and 3, simultaneous synthesized vowels were used on frequency contours that were steady-state, gliding in parallel (parallel glides), or gliding in opposite directions (crossing glides). The results showed that crossing glides led to significantly better vowel identification than did steady-state F0s. Also, in certain cases, crossing glides were more effective than parallel glides. The superior effect of the crossing glides could be due to the common frequency modulation of the harmonics within each component of the vowel pair and the consequent decorrelation of the harmonics between the two simultaneous vowels.