The correspondences between the perception of the speaker individualities contained in speech sounds and their acoustic properties

This study investigates the correspondences between the differences among phones in human speaker identification and their acoustic properties. In the speaker identification test, Japanese CV syllables excerpted from carrier sentences were used as the stimuli. As pointed out in previous studies, stimuli containing nasal sounds were significantly more effective for identifying the speakers than stimuli containing only oral sounds. In the acoustic analyses, we analysed the spectral properties of the stimuli in order to explain these differences in the perception test, and we found that the cepstral distances among the speakers were significantly larger in the nasal sounds than in the oral sounds. There were also correspondences between the rankings of the consonants in the identification test and their rankings in the cepstral distances.
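As a rough illustration of what an inter-speaker cepstral distance can look like, the sketch below computes a real cepstrum for one frame per speaker and averages the Euclidean distance over all speaker pairs. This is not the authors' procedure: the abstract does not state the cepstrum type, the number of coefficients, the window, or the distance measure, so the function names, the Hamming window, the 12 coefficients, and the Euclidean metric are all assumptions. A naive DFT is used only to keep the example dependency-free; an FFT library would be the usual choice for real recordings.

```python
import cmath
import math


def real_cepstrum(frame, n_coeffs=12):
    """Low-order real cepstrum of one frame (assumed analysis settings).

    Steps: Hamming window -> DFT -> log magnitude -> inverse DFT.
    c0 is dropped so overall level does not dominate the distance.
    """
    n = len(frame)
    windowed = [
        x * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
        for i, x in enumerate(frame)
    ]
    # Naive O(n^2) DFT; the small floor avoids log(0).
    log_mag = []
    for k in range(n):
        s = sum(
            windowed[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)
        )
        log_mag.append(math.log(abs(s) + 1e-10))
    # The log magnitude spectrum of a real signal is real and even,
    # so its inverse DFT reduces to a cosine sum.
    return [
        sum(log_mag[k] * math.cos(2 * math.pi * k * q / n) for k in range(n)) / n
        for q in range(1, n_coeffs + 1)
    ]


def cepstral_distance(frame_a, frame_b, n_coeffs=12):
    """Euclidean distance between the cepstra of two equal-length frames."""
    ca = real_cepstrum(frame_a, n_coeffs)
    cb = real_cepstrum(frame_b, n_coeffs)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(ca, cb)))


def mean_inter_speaker_distance(frames_by_speaker, n_coeffs=12):
    """Average cepstral distance over all unordered speaker pairs.

    frames_by_speaker maps a (hypothetical) speaker label to one frame
    taken from the same phone for every speaker.
    """
    speakers = sorted(frames_by_speaker)
    distances = [
        cepstral_distance(frames_by_speaker[a], frames_by_speaker[b], n_coeffs)
        for i, a in enumerate(speakers)
        for b in speakers[i + 1:]
    ]
    return sum(distances) / len(distances)
```

Under these assumptions, running `mean_inter_speaker_distance` once on frames taken from a nasal and once on frames taken from an oral consonant would give two numbers that can be compared in the way the abstract describes.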

Speaker Similarities in Human Perception and Their Spectral Properties

2006

The acoustic properties of voiced sonorants are said to be speaker-dependent, and it is also reported that these sounds are effective for identifying speakers. In our previous experiments, we found that nasal sounds are highly effective for speaker identification by listening, and that the inter-speaker distances in the spectra were also greater in nasals than in oral sounds. The present study further analyses the spectral properties of nasals and orals in terms of perceptual and acoustical voice similarity and shows that nasal sounds may have a longer interval that listeners can exploit for speaker identification.

Effects of linguistic contents on perceptual speaker identification: Comparison of familiar and unknown speaker identifications

Acoustical Science and Technology, 2009

There are several factors that affect human speaker recognition. In this study, two experiments were conducted to examine how stimulus content and familiarity with the speakers affect the perception of the speakers. The results showed that: a) stimuli including a nasal were effective for accurate speaker identification; b) coronal nasals were more effective than the labial nasal; and c) familiarity with the speakers greatly influences performance. Tendencies a) and b) were observed in both familiar and unknown speaker identification. The results of the acoustical analyses also showed that there were correspondences between the perception of speaker identity and the cepstral distances among the speakers. The inter-speaker cepstral distances were greater in the vowel intervals than in the consonant intervals; within the consonant intervals, they were notably greater in nasals than in orals.

Contribution of Consonants and Vowels to the Perception of Speaker Identity

2015

Many studies report that stimulus content affects perceptual speaker identification. In our previous research, we showed that nasal sounds are effective for accurate speaker identification. In the present study, we investigate the contributions of the syllable onset consonant (C) and the syllable nucleus (V) to the perception of speaker identity by using hybrid CV monosyllables in which C and V are uttered by two different speakers. The results showed that the perceived speaker of the hybrid CV syllables tended to be the speaker of V, and this tendency was especially prominent for stimuli containing nasal consonants. This suggests that vowels mainly convey speaker individuality, and that nasalised vowels contain more speaker information than oral vowels.

Perceptual speaker identification using monosyllabic stimuli - effects of the nucleus vowels and speaker characteristics contained in nasals

2008

The goal of our research is to find out the acoustical correlates of human perception of speaker identity. In this study we investigated the effects of the stimulus contents on perceptual speaker identification. Forty-eight monosyllables were used as the stimuli for identifying four male speakers. The results showed that the syllables containing a coronal nasal yielded higher identification accuracies than the syllables without it, and the syllables with a back vowel gained significantly better scores than those with a front vowel. We also found speaker-dependent characteristics in the velar movements in articulation of nasal consonants.

Effects of vowel types on perception of speaker characteristics of unknown speakers

The aim of this study is to explore possible speaker characteristics common to speech sounds, through two psychoacoustic experiments. In the first experiment, sustained Japanese vowels produced by four adult male speakers were used, and ABX tests were carried out to confirm whether speaker individualities common to sustained vowels exist by testing whether participants can verify unknown speakers using speaker characteristics obtained from other vowels. The experimental results show that there are some speaker characteristics common to the vowels and that pitch frequency is one of the primary cues for identification of unknown speakers of the sustained vowels. In the second experiment, ABX tests were conducted using three Japanese sentences produced by four adult male speakers. The results indicate that the participants can identify the speakers even though the mean of the pitch frequency is speaker-normalized, implying that dynamic properties of speech are important for auditory-perceptual speaker identification.
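For readers unfamiliar with ABX testing: on each trial a listener hears A, B, and X and reports whether X was produced by the speaker of A or of B, so guessing gives 50% correct. The sketch below scores such responses and computes an exact one-sided binomial tail against chance. The response format, the function names, and the choice of a binomial test are assumptions made for illustration; the abstract does not say how the results were analysed.

```python
from math import comb


def abx_score(responses):
    """Count correct trials.

    responses is a list of (chosen, correct) pairs, each "A" or "B"
    (a hypothetical format). Returns (n_correct, n_trials).
    """
    n_correct = sum(1 for chosen, correct in responses if chosen == correct)
    return n_correct, len(responses)


def binomial_tail(successes, trials, p=0.5):
    """P(X >= successes) for X ~ Binomial(trials, p).

    With p = 0.5 this is the probability of doing at least this well
    by guessing alone in a two-alternative ABX task.
    """
    return sum(
        comb(trials, k) * p ** k * (1 - p) ** (trials - k)
        for k in range(successes, trials + 1)
    )


# Usage with made-up responses, for illustration only.
example = [("A", "A"), ("B", "B"), ("A", "B"), ("B", "B")]
n_correct, n_trials = abx_score(example)
p_value = binomial_tail(n_correct, n_trials)
```

A small tail probability would indicate that listeners matched X to the right speaker more often than guessing would explain.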

The Prototype Model in Speaker Identification by Human Listeners

International Journal of Speech Technology, 2001

Little is known about the perceptual processes of speaker identification and their relationship to the acoustic features of the speaker's voice. A study of speaker perception and identification by psychoacoustic experiments was carried out. Twenty male speakers were recorded and thirty listeners participated in the experiments. Statistical analysis of the results suggests that the prototype model is appropriate for explaining

Talker identification based on phonetic information

Journal of Experimental Psychology: Human Perception and Performance, 1997

Accounts of the identification of words and talkers commonly rely on different acoustic properties. To identify a word, a perceiver discards acoustic aspects of an utterance that are talker specific, forming an abstract representation of the linguistic message with which to probe a mental lexicon. To identify a talker, a perceiver discards acoustic aspects of an utterance specific to particular phonemes, creating a representation of voice quality with which to search for familiar talkers in long-term memory. In 3 experiments, sinewave replicas of natural speech sampled from 10 talkers eliminated natural voice quality while preserving idiosyncratic phonetic variation. Listeners identified the sinewave talkers without recourse to acoustic attributes of natural voice quality. This finding supports a revised description of speech perception in which the phonetic properties of utterances serve to identify both words and talkers.
