Machine Assisted Analysis of Vowel Length Contrasts in Wolof
Related papers
Vowel quantity contrasts in Twi
15th ICPhS, 2003
This paper reports observations and results obtained in an experiment on short and long vowels in Twi, a tone language spoken in Ghana, West Africa. Two adult Twi speakers produced oral and nasal vowels belonging to the two phonological classes. The overwhelming evidence from our acoustic data is that vowel duration is the determining factor in distinguishing the two classes. Acoustic results further show that short and long vowels are distinguished not only by vowel duration but also by post-vocalic consonant duration: phonologically short vowels are followed by phonetically long consonants. Relative values also make clear and further confirm the robustness of the feature in Twi.
Automatic Speech Recognition for African Languages with Vowel Length Contrast
Procedia Computer Science, 2016
This paper deals with ASR for two languages: Hausa and Wolof. Both languages have a vowel length contrast: short and long versions of the same vowel exist in the phoneme inventory. We expect that taking this contrast into account in ASR models might help, and this is what we investigate in this pilot study. The experimental results show that while the two approaches (modeling the vowel length contrast or not) lead to similar results, combining them slightly improves ASR performance. As a by-product of ASR system design, we also show that the resulting acoustic models allow a large-scale analysis of vowel length contrast for phonetic studies.
Exploiting machine algorithms in vocalic quantification of African English corpora
2019
Towards procedural fidelity in the processing of African English speech corpora, this work demonstrates how adapting machine-assisted segmentation of phonemes and automatic extraction of acoustic values can significantly speed up the processing of naturalistic data and make the vocalic analysis of the varieties less impressionistic. Research in African English phonology has, to date, been only minimally data-driven, and has made even less use of comparative corpora for cross-varietal assessments. Using over 30 hours of naturalistic data (from 28 speakers in 5 Nigerian cities), the procedures for segmenting audio files into phonemic units via the Munich Automatic Segmentation System (MAUS), and for extracting their spectral values in Praat, are explained. Evidence from the speech corpora supports a more complex vocalic inventory than attested in previous auditory/manual-based accounts, thus reinforcing the resourcefulness of the algorithms for the current data and cognate varieties. Key...
Relationship Between Lexical Tone Contrasts and Vowel Quality
The language under study is Twi, a register tone language of the Kwa group, spoken in Ghana. It has a two-tone system (high/low) and exhibits downstep. The aim of this paper is two-fold. First, some phonological aspects of Twi tones are described. Second, acoustic measurements are carried out to investigate differences in vowel quality between the two tone classes. In this analysis, two adult male speakers and one adult female speaker produced a series of isolated words belonging to the two phonological classes. The evidence from our acoustic data, confirming results obtained in a preliminary study, is that high tones have higher fundamental frequency values than low tones. Acoustic results also confirm that high and low tones show sparse qualitative (F1, F2, F3 and F4) as well as sparse syllabic durational differences. Low tones also tend to have lower F1 and higher F2 than their high counterparts.
A study in vowels: Comparing phonetic difficulties between languages
Fòrum de Recerca, 2019
Decade after decade, language learning still proves difficult for EFL learners all around the world. Countries consistently face the same phonetic problems they had before. This article presents two methods commonly used when comparing languages (contrastive analysis and statistics) and adds two more, relatively fresh methods (spectrographic analysis and Johari windows), in order to find a new angle from which to look at old problems. By taking more points of view into account, the resulting data could be a starting point for future researchers, facilitating EFL learning for future generations.
Phonetic correlates of tongue root vowel contrasts in Maa
2004. Journal of Phonetics 32.4: 517-42
Maa, a Nilo-Saharan language, exhibits a cross-height vowel harmony system known as ‘tongue root harmony’. The high and mid vowels participate in this system, but the low vowel does not. The Maa harmony system is briefly described, followed by an investigation into the phonetic properties of the vowels. Five Maa speakers were recorded producing 100 example words three times each. The [+ATR] vowels were found to have consistently lower first formant values and relatively less energy in the higher frequency regions than their [−ATR] counterparts. An investigation of the differences between the auditorily quite similar [−ATR] high and [+ATR] mid vowels revealed durational differences for the back vowels and much inter-speaker variation for the front vowels. Electroglottographic data obtained from one speaker indicated a slightly less constricted glottis for [+ATR] than [−ATR] vowels. This phonation difference is not readily detectable auditorily in the current data, but has been reported previously for Maa. The results contribute to typological knowledge about the phonetics of tongue root vowel contrasts, as very little data is currently available for Nilo-Saharan languages. A possible origin of stronger voice quality distinctions common to other tongue root harmony languages is offered from the theory of Auditory Enhancement.
Speaker accent influences the accuracy of automatic speech recognition (ASR) systems. Knowledge of accent-based acoustic variations can therefore be used in the development of more robust systems. This paper investigates the differences between first language (L1) and second language (L2) English in South Africa and is specifically aimed at L2 English speakers with an African mother tongue, for instance Xhosa, Zulu or South Sotho. The vowel systems of English and African languages, as described in the linguistic literature, were compared to predict the expected deviations of L2 South African English from the L1 norm. A total of fifty context-dependent phonemes from L1 and L2 speakers were acoustically compared and analysed in both formant and mel-scaled cepstral domains. The measured variations compared favourably to those linguistically predicted. The long-term goal of this project is to aid in the adaptation of existing L1 English recognition systems for South African L2 English.
2005
Length Contrast and Covarying Features: Whistled Speech as a Case Study
Interspeech 2018
The status of covarying features to sound contrasts is a longstanding issue in speech: are they deliberately controlled by the speakers, or are they contingent automatic effects required by the defining features? We address this question by drawing parallels between the way gemination is implemented in spoken language and the way it is rendered in whistled speech. Audio materials were collected with five Berber whistlers in Morocco. The spoken and whistled data were composed of pairs of words contrasting singletons to geminates in different word positions. Compared to spoken forms, whistling, while adapting to the specific constraints imposed by the medium, transposes the basic strategies used in normal speech. As in normal speech, the primary and most salient acoustic attribute differentiating whistled singletons and geminates is closure duration. But duration is not used alone. Covarying secondary attributes are conveyed which may serve to enhance the primary correlate by contributing additional properties increasing the distance between the two lexical categories. These enhancing correlates may take on distinctive function in cases where the primary correlate is not implemented. This is, for instance, the case of higher frequency values in word-initial position where duration differences cannot be acoustically implemented using whistled speech.