The role of the auditory and visual modalities in the perceptual identification of Brazilian Portuguese statements and echo questions
Related papers
Language and Cognition, 2024
This paper presents an audiovisual perceptual analysis of the wh-question and wh-exclamation intonation in Brazilian Portuguese using auditory-visual congruent and incongruent stimuli, to investigate the relative importance of each modality in signaling pragmatic meanings. Ten Brazilian Portuguese speakers (five female) were filmed while producing both speech acts 10 times. Next, artificial stimuli were created: audio and visual cues were either matched (audio and video from the same speech act) or mismatched (audio and video from different speech acts), resulting in 10 congruent and 10 incongruent stimuli of the wh-questions and the wh-exclamations. The perceptual experiment was taken by 36 Brazilians who identified each stimulus as a question or an exclamation. Results from the logistic regression showed that the factor 'congruence' was significant and had a significant interaction with 'speakers', which means that the congruent stimuli increased the comprehension of the Brazilian Portuguese wh-questions and wh-exclamations. In contrast, the incongruent stimuli tended to lower listeners' identification, but to a degree depending on individual speakers' strategies. Although variation in the accuracy of expressing both speech acts was also found across speakers, this study corroborates that the visual channel impacts the perceptual identification of the pragmatic intonation function of distinguishing sentence mode.
The Role of Visual Stimuli in the Perception of Prosody in Brazilian Portuguese
This study analyzes the role of visual and acoustic stimuli in the recognition of prosodic characteristics, such as the discrimination of statements and yes-no questions in Brazilian Portuguese (BP). Other studies, such as Massaro (1998), Fagel (2006), Abelin (2007), and Ronquest et al. (2010), claim that speech perception is bimodal. In this study, after performing three experiments, it was observed that both modalities (acoustic and visual) play an important role in speech perception. The process by which sounds that belong to some language are heard, interpreted, and understood is bimodal.
Audiovisual perception of wh-questions and wh-exclamations in Brazilian Portuguese
International Congress of Phonetic Sciences, 2019
This paper examines auditory and visual cues used for the discrimination of wh-question and wh-exclamative speech acts in Brazilian Portuguese. The sentence Como você sabe was uttered as a wh-question (meaning How do you know?) and as a wh-exclamation (meaning How clever you are!) by ten Brazilian Portuguese speakers (five males and five females) from Rio de Janeiro. The acoustic and visual analyses revealed that these two speech acts not only showed different F0 contours and intensity patterns, but also discriminant facial expressions. A perceptual experiment that investigates the role of visual versus audio channels with three presentation conditions (audio only, video only and audiovisual) was applied with sixty Brazilian participants (twenty per condition). The results indicate that listeners rely on both channels to perceive the wh-questions and wh-exclamations and that the audiovisual condition was more accurately recognized than the monomodal ones.
Visual and auditory cues of assertions and questions in Brazilian Portuguese and Mexican Spanish
Journal of Speech Sciences, 2020
The aim of this paper is to compare the multimodal production of questions in two different language varieties: Brazilian Portuguese and Mexican Spanish. Descriptions of the auditory and visual cues of two speech acts, assertions and questions, are presented based on Brazilian and Mexican corpora. The sentence “Como você sabe” was produced as a yes-no (echo) question and an assertion by ten speakers (five male) from Rio de Janeiro and the sentence “Apaga la tele” was produced as a yes-no question and an assertion by five speakers (three male) from Mexico City. The results show that, whereas the Brazilian Portuguese and Mexican Spanish assertions are produced with different F0 contours and different facial expressions, questions in both languages are produced with specific F0 contours but similar facial expressions. The outcome of this comparative study suggests that lowering the eyebrows, tightening the lid and wrinkling the nose can be considered question markers in both language ...
Revista DELTA: Documentação de Estudos em Lingüística Teórica e Aplicada
This study presents a perceptual analysis of the Brazilian Portuguese wh-question and wh-exclamation intonational contours to discriminate their acoustic and perceptual features. The corpus of this study is composed of the sentence “Como você sabe” (“How do you know?” vs. “How clever you are!”), which was produced with both speech acts. Two perceptual identification experiments were designed to assess the subjects’ ability to identify these speech acts based on their prosodic characteristics, as well as the perceptual relevance of specific prosodic cues in the recognition of wh-questions and wh-exclamations. The results of the first perceptual test indicated that Brazilian listeners can identify these two speech acts by intonation only, whereas the second test showed that F0, duration and intensity cues contribute to the perceptual identification of the speech acts. Stimuli with a falling F0 movement in the last stressed syllable tend to be interpreted as wh-questions, whereas stimuli with a slightly rising F0 movement tend to be judged as wh-exclamations.
Journal of Speech Sciences, 2020
The aim of this paper is to compare the multimodal production of assertions and questions in two different languages: Brazilian Portuguese and Mexican Spanish. Descriptions of the auditory and visual cues of these speech acts are presented based on Brazilian and Mexican corpora. The sentence "Como você sabe" was produced as an assertion and an echo question by ten speakers (five male) from Rio de Janeiro and the sentence "Apaga la tele" was produced as an assertion and a yes-no question by five speakers (three male) from Mexico City. The intonational patterns of the speech acts were described in terms of F0 movements and annotated in the nuclear region of the contours with the ToBI system. Momentary facial muscular changes (namely Action Units) located in the upper and lower parts of the face, as well as head movements, were used to analyze the facial expressions. The acoustic description showed that Brazilian Portuguese assertions are produced with a falling F0 nuclear configuration (H+L*L%) and echo questions with a rising F0 nuclear configuration (L+<H*L%). Mexican Spanish assertions present two types of F0 nuclear configurations, either a low flat nuclear F0 (L*L%) or a falling-rising (L+H*L%) nuclear F0, whereas Mexican Spanish yes-no questions are produced with a low nuclear F0 followed by a rising boundary tone (L*LH%). The outcome of the visual analysis indicates that, whereas Brazilian Portuguese assertions are visually produced with a blink and a right head tilt and Mexican Spanish assertions with a lip stretcher, lowering the eyebrows, tightening the eyelid and wrinkling the nose can be considered question markers in both language varieties.
Revista de Estudos da Linguagem, Belo Horizonte, v. 23, n. 2, p. 311-334, 2015
The present study aimed to investigate how different Voice Onset Time (VOT) patterns are categorized by native speakers of American English and Brazilian learners of English. American English and Brazilian Portuguese diverge as to the voicing patterns of plosive consonants, for the VOT cue plays different roles in the distinction between voiced and voiceless consonant categories in each system. This study contrasted four VOT patterns (Negative VOT, Zero VOT, Positive VOT and a manipulated pattern, named Artificial Zero VOT) in two perceptual tasks (AxB discrimination and identification tests), and verified how the two groups of participants categorized these patterns. Results reinforce the idea that speech perception is multimodal and, therefore, the action of multiple cues must be taken into account when we consider phonetic-phonological processes.
LANGUAGE DESIGN. JOURNAL OF THEORETICAL AND EXPERIMENTAL LINGUISTICS, 2008
SP-ToBI (Beckman et al., 2002) has been used to a large extent as a unique and consensual system in many Spanish intonation studies (Kimura 2006, Sahyang, Andruski, Casielles, Nathan & Work 2006, Velázquez 2006, etc.). However, recent studies point out the existence of more pitch accent tonal sequences and alignments than those proposed in SP-ToBI (Prieto & Torreira 2004, Ramírez Verdugo 2005, Toledo 2006). In fact, Face & Prieto (2006) argue that Beckman’s preliminary system needs to be revised to allow for a wider and more realistic inventory of Spanish pitch accents. In this respect, Face and Prieto survey data on rising accents in Castilian Spanish to propose a three-way contrast rather than the assumed two-way contrast in rising accents. They even challenge the manner in which starredness has been commonly assigned to bitonal accents and propose an analysis based on the secondary association of pitch accent tones. To validate their theory, Face and Prieto (2006) call for experiments that examine how strongly or weakly salient speakers perceive rising accents to be. Such experiments could help to decide whether such variation and contrast correspond to different phonetic realizations or, on the contrary, to distinct phonological categories. Taking this background into account, this paper aims to shed some light on this dichotomy. The study explores the findings obtained in an experimental study on Spanish versus British speakers’ perception of interrogative and declarative sentences in the AMPER-Madrid Corpus. These research results reveal interesting cross-linguistic differences and similarities that will be explained in light of previous studies on prosodic cues and theoretical foundations. Keywords: Prosodic perception, interrogatives, declaratives, Spanish versus English cross-linguistic study.
2002
This study was designed to identify English speech contrasts that might be appropriate for the computer-based auditory-visual training of Spanish learners of English. It examines auditory-visual and auditory consonant and vowel confusions by Spanish speaking students of English and a native English control group. 36 Spanish listeners were tested on their identification of 16 consonants and 9 vowels of British English. For consonants, both L2 learners and controls showed significant improvements in the audiovisual condition, with larger effects for syllable final consonants. The patterns of errors by L2 learners were strongly predictable from our knowledge of the relation between the phoneme inventories of Spanish and English. Consonant confusions which were language-dependent – mostly errors in voicing and manner – were not reduced by the addition of visual cues whereas confusions that were common to both listener groups and related to acoustic-phonetic sound characteristics did sho...
Top-down and bottom-up modulation of audiovisual integration in speech
European Journal of Cognitive Psychology, 2005
This research assesses how audiovisual speech integration mechanisms are modulated by sensory and cognitive variables. For this purpose, the McGurk effect was used as an experimental paradigm. This effect occurs when participants are exposed to incongruent auditory and visual speech signals. For example, when an auditory /b/ is dubbed onto a visual /g/, listeners are led to perceive a fused phoneme like /d/. With the reverse presentation, they experience a combination such as /bg/. In two experiments, auditory intensity (40 dB, 50 dB, 60 dB, and 70 dB), face size (large: 19 × 23 cm and small: 1.8 × 2 cm) and instructions ("multiple choice"