Intra- and inter-speaker variability of vowel space using three different formant extraction methods
Related papers
Formant Contours in Czech Vowels: Speaker-discriminating Potential
The usefulness of dynamic formant properties for speaker discrimination was demonstrated on English, mostly exploiting long vowels or diphthongs, characterized both by actual measures along the formant tracks and coefficients from polynomial regression on these tracks. This study applies this paradigm to Czech, using less tightly controlled (more forensically realistic) material and taking into account the specific properties of the Czech vocalic system in which long vowels are much rarer than short ones. When all vowels are pooled together, the best results are achieved for unstressed vowels in asymmetrical CVC contexts. When individual vowels are considered separately, classification rates are best for long [i:] and [a:], but, most importantly, short vowels also show promising results. The performance of actual formant values and regression coefficients as predictors in discriminant analysis appears comparable.
Vowel formant dispersion as a measure of articulation proficiency
The Journal of the Acoustical Society of America, 2012
The articulatory range of a speaker has previously been estimated by the shape formed by first and second formant measurements of produced vowels. In a majority of the currently available metrics, formant frequency measurements are reduced to a single estimate for a condition, which has adverse consequences for subsequent statistical testing. Other metrics provide estimates of the size of vowel articulation changes only, and do not provide a method for studying the direction of the change. This paper proposes an alternative approach. Vowel formant frequencies are re-defined as vectors originating from a defined center point of the vowel space fixed to a basic three-vowel frame. The Euclidean length of the vectors, the Vowel Formant Dispersion (VFD), can be compared for evidence of articulatory expansions or reductions across conditions or speaker groups. Further, the angle component of the vowel vectors allows for analyses of the direction of the reduction or expansion. Based on the range of investigations afforded by the VFD metric, and simulation experiments that compare its statistical properties with those of other proposed metrics, it is argued that the VFD procedure offers an enhanced view of vowel articulation change over rival metrics.
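The vector idea in this abstract can be illustrated with a minimal sketch. The abstract does not say how the center point is derived from the three-vowel frame, so the centroid of the three corner-vowel means is used here purely as an assumption. The function names `frame_center` and `vfd`, the angle convention, and all formant values are hypothetical.

```python
import math


def frame_center(corners):
    """Centroid of the corner-vowel means, given as (F1, F2) pairs in Hz.

    Assumption: the abstract only says the center is "fixed to a basic
    three-vowel frame"; the centroid is one plausible reading, not
    necessarily the published definition.
    """
    n = len(corners)
    return (sum(c[0] for c in corners) / n, sum(c[1] for c in corners) / n)


def vfd(f1, f2, center):
    """Return (length, angle_deg) of the vector from the center to (f1, f2).

    The length is the Euclidean distance in Hz. The angle is measured from
    the positive F2 axis towards the positive F1 axis, which is an arbitrary
    convention chosen for this sketch.
    """
    d_f1 = f1 - center[0]
    d_f2 = f2 - center[1]
    return math.hypot(d_f1, d_f2), math.degrees(math.atan2(d_f1, d_f2))


# Hypothetical mean (F1, F2) values in Hz for three corner vowels.
corners = [(300.0, 2300.0), (750.0, 1300.0), (300.0, 900.0)]
center = frame_center(corners)

# One hypothetical vowel token measured at F1 = 750 Hz, F2 = 1100 Hz.
length, angle = vfd(750.0, 1100.0, center)
print(center, length, angle)
```

Because every token yields its own length and angle, the per-token values can be passed to a statistical test directly instead of being collapsed into a single estimate per condition.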
FORMANT FREQUENCIES AS CUES TO VOICING AND PLACE OF ARTICULATION DISTINCTION - AN ACOUSTICAL STUDY
НАУКА И СТВАРНОСТ: Зборник радова са научног скупа (Пале, 19. мај 2018) [Science and Reality: Proceedings of the Scientific Conference (Pale, 19 May 2018)], 2019
In this paper we examine the formant frequencies of post-fricative vowels in real Serbian words. Results from two female native speakers indicate that the frequencies of the first three formants make it possible to differentiate the place of articulation of the fricatives. The results of the experiment also show that the phonological category of [voice] may be distinguished from the frequencies of the three formants alone. Tempo of speech distinguished F2 and F3 but not F1. The behavior of the formant trajectories along the temporal dimension is also presented.
The mapping of voice parameters in connected speech of healthy Common Czech male speakers
2019
This study examines a set of voice parameters to map objective ranges of voice-source characteristics of healthy male speakers of Common Czech. Objective assessment of voice quality is conducted mainly in speakers with voice pathologies, typically using sustained vowels as the basis for measurements. In our study, we focused on nonpathological voices and performed acoustic measurements of the voice parameters which are believed to reflect glottal characteristics. The analyses were based on the open vowels [a a:] extracted from fifty healthy male speakers who performed a reading task. Voice parameter estimation included f0 perturbation measures (jitter and shimmer), harmonicity (HNR), Cepstral Peak Prominence (CPP), and harmonic amplitude measures which reflect short-term spectral slope (e.g., H1−H2, H2−H4, or H1−A3). The obtained data relate to connected speech and are compared to the measurements on sustained vowels.
Formants frequencies variability in French vowels under the effect of various speaking styles
Le Journal de …, 1994
The paper presents first results of research aiming at a more refined definition of the so-called speaking styles. A speaker is recorded in 6 situations and the formants of his /i/, /a/ and /u/ vowels are studied. New categorizations of the speaking styles are proposed, on the basis of the similarities and dissimilarities in formant frequency values drawn from the various communication situations.
We evaluate a vowel formant normalisation technique that allows direct visual and statistical comparison of vowel triangles for multiple speakers of different sexes, by calculating for each speaker a 'centre of gravity' S in the F1 ~ F2 plane. S is calculated on the basis of formant frequency measurements taken for the so-called 'point' vowel [i], the average F1 and F2 for the vowel category with the highest average F1 (for English, usually the vowel of the TRAP or START lexical sets), and hypothetical minimal F1 and F2 values (coordinates we label [u′]) extrapolated from the other two points. Expression of individual F1 and F2 measurements as ratios of the value of S for that formant permits direct mapping of different speakers' vowel triangles onto one another, resulting in marked improvements in agreement in vowel triangle (a) area and (b) overlap, as compared to similar mappings attempted using linear Hz scales and the z (Bark) scale. Nelson, D. (ed.) Leeds Working Papers in Linguistics and Phonetics 9 (2002), pp. 159-173.
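The centre-of-gravity normalisation described above could be sketched roughly as follows. The abstract does not say how the hypothetical minimal corner is extrapolated, so this sketch assumes that both its F1 and its F2 equal the F1 of [i], and that S is the per-formant mean of the three corners. The function names and all formant values are hypothetical.

```python
def s_centroid(i_mean, open_mean):
    """Return S = (S_F1, S_F2) for one speaker.

    i_mean and open_mean are mean (F1, F2) pairs in Hz for [i] and for the
    vowel category with the highest average F1.

    Assumption: the hypothetical minimal corner takes F1 = F2 = F1 of [i];
    the abstract only says it is extrapolated from the other two points.
    """
    i_f1, i_f2 = i_mean
    a_f1, a_f2 = open_mean
    u_f1 = u_f2 = i_f1  # assumed extrapolation rule
    s_f1 = (i_f1 + a_f1 + u_f1) / 3
    s_f2 = (i_f2 + a_f2 + u_f2) / 3
    return s_f1, s_f2


def normalise(f1, f2, s):
    """Express one (F1, F2) measurement as ratios of S for each formant."""
    return f1 / s[0], f2 / s[1]


# Hypothetical speaker means in Hz.
s = s_centroid(i_mean=(300.0, 2400.0), open_mean=(900.0, 1500.0))
print(s)
print(normalise(900.0, 1500.0, s))
```

Since each speaker's measurements are divided by that speaker's own S, the resulting ratios are unitless, which is what lets triangles from different speakers be overlaid on the same axes.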
Vowel space density as an indicator of speech performance
The Journal of the Acoustical Society of America, 2017
The purpose of this study was to develop a method for visualizing and assessing the characteristics of vowel production by measuring the local density of normalized F1 and F2 formant frequencies. The result is a three-dimensional plot called the vowel space density (VSD) and indicates the regions in the vowel space most heavily used by a talker during speech production. The area of a convex hull enclosing the vowel space at specific threshold density values was proposed as a means of quantifying the VSD.
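A rough sketch of the thresholded convex-hull idea is given below. The abstract does not specify how local density is estimated, so a simple grid count stands in for it here as an assumption. The cell size, the threshold values, the function names and the sample points are all hypothetical.

```python
from collections import Counter


def cross(o, a, b):
    """Z component of the cross product of vectors o->a and o->b."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Andrew's monotone chain; hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def polygon_area(vertices):
    """Shoelace formula for a simple polygon given in vertex order."""
    total = 0.0
    n = len(vertices)
    for k in range(n):
        x1, y1 = vertices[k]
        x2, y2 = vertices[(k + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2


def vsd_hull_area(points, cell, threshold):
    """Area of the convex hull around grid cells with high relative density.

    Assumption: density is approximated by counting normalized (F1, F2)
    points per square grid cell and dividing by the peak count.
    """
    counts = Counter((int(x // cell), int(y // cell)) for x, y in points)
    peak = max(counts.values())
    dense = [((i + 0.5) * cell, (j + 0.5) * cell)
             for (i, j), c in counts.items() if c / peak >= threshold]
    return polygon_area(convex_hull(dense))


# Hypothetical normalized formant points: four well-used regions, one outlier.
pts = ([(0.5, 0.5)] * 2 + [(2.5, 0.5)] * 2 + [(0.5, 2.5)] * 2
       + [(2.5, 2.5)] * 2 + [(5.5, 5.5)])
print(vsd_hull_area(pts, cell=1.0, threshold=0.75))
print(vsd_hull_area(pts, cell=1.0, threshold=0.5))
```

Raising the threshold drops sparsely used regions such as the single outlier, so the hull shrinks towards the part of the vowel space the talker uses most.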
Human Language Technology. Challenges for Computer Science and Linguistics, 2020
In this paper, we discuss the results of the analysis of F1 and F2 frequency measurements in Polish nasalized vowels represented in writing by the graphemes ę and ą (realized before voiceless fricatives). The speech material included recordings of isolated word items provided by 20 adult native speakers of Polish (10 females and 10 males). According to the claims often presented in phonetic studies, the two vowels are phonetically realized as diphthongs composed of two subsequent stages of realization: an oral and a nasal stage. In our investigation, we refer to the results obtained by Lorenc et al. (cf. [13] and [14]) based on the analyses of spatial distribution of the acoustic field which indicate that the structure might be even more complex in certain cases and include three or even more stages. We measure formant frequencies within these stages using the stage timestamps obtained with a novel infrastructure composed of a multi-channel recorder with a circular microphone array. Among other things, the results indicate that the two vowels differ significantly with regard to their internal structures as expressed by the number and types of the stages as well as the formant frequency characteristics of those stages.
… of the 16th International Congress of …, 2007
Eight languages (Arabic, English, French, German, Italian, Mandarin Chinese, Portuguese, Spanish) with 6 differently sized vowel inventories were analysed in terms of vowel formants. A tendency to phonetic reduction for vowels of short acoustic durations clearly emerges for all languages. The data did not provide evidence for an effect of inventory size on the global acoustic space and only the acoustic stability of quantal vowel /i/ is greater than that of other vowels in many cases.
The Journal of the Acoustical Society of America, 2013
The purpose of this study is to investigate the potential relationship between speaking fundamental frequency and acoustic vowel space size, thus testing a possible perceptual source of sex-specific differences in acoustic vowel space size based on the greater inter-harmonic spacing and a poorer definition of the spectral envelope of higher pitched voices. Average fundamental frequencies and acoustic vowel spaces of 56 female German speakers are analyzed. Several parameters are used to quantify the size and shape of the vowel space defined by /iː e aː O uː/, such as the area of the polygon spanned by the five vowels, the absolute difference in F1 or F2 between /iː/ and /uː/ or /aː/, and the Euclidean distance between /iː/ and /aː/. In addition, the potential impact of nasality on the vowel space size is examined. Results reveal no significant correlation between fundamental frequency and vowel space size, suggesting that other factors must be responsible for the larger female acoustic vowel space.
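The size parameters listed in this abstract can be illustrated with a short sketch. It assumes the five vowel means are supplied in perimeter order so that the polygon does not self-intersect. The vowel labels used as dictionary keys, the function names and all formant values are hypothetical.

```python
import math


def polygon_area(vertices):
    """Shoelace formula; vertices must follow the polygon's perimeter."""
    total = 0.0
    n = len(vertices)
    for k in range(n):
        x1, y1 = vertices[k]
        x2, y2 = vertices[(k + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2


def euclidean(a, b):
    """Euclidean distance between two (F1, F2) points in Hz."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


# Hypothetical mean (F1, F2) values in Hz, listed in perimeter order.
vowels = {
    "i": (300.0, 2300.0),
    "e": (450.0, 2000.0),
    "a": (800.0, 1100.0),
    "o": (450.0, 900.0),
    "u": (300.0, 700.0),
}

area = polygon_area(list(vowels.values()))           # in Hz squared
f2_range = abs(vowels["i"][1] - vowels["u"][1])      # |F2(i) - F2(u)|
f1_range = abs(vowels["i"][0] - vowels["a"][0])      # |F1(i) - F1(a)|
dist_i_a = euclidean(vowels["i"], vowels["a"])
print(area, f2_range, f1_range, dist_i_a)
```

Each of these values could then be correlated with a speaker's average fundamental frequency, which is the kind of comparison the study reports.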