Task-dependent modulation of the visual sensory thalamus assists visual-speech recognition

Neural pathways for visual speech perception

Frontiers in Neuroscience, 2014

This paper examines two questions: what levels of speech can be perceived visually, and how is visual speech represented by the brain? A review of the literature leads to the conclusions that every level of psycholinguistic speech structure (i.e., phonetic features, phonemes, syllables, words, and prosody) can be perceived visually, although individuals differ in their abilities to do so; and that there are visual modality-specific representations of speech qua speech in higher-level vision brain areas. That is, the visual system represents the modal patterns of visual speech. The suggestion that the auditory speech pathway receives and represents visual speech is examined in light of neuroimaging evidence on the auditory speech pathways. We outline the generally agreed-upon organization of the visual ventral and dorsal pathways and examine several types of visual processing that might be related to speech through those pathways, specifically face and body, orthography, and sign language processing. In this context, we examine the visual speech processing literature, which reveals widespread, diverse patterns of activity in posterior temporal cortices in response to visual speech stimuli. We outline a model of the visual and auditory speech pathways and make several suggestions: (1) The visual perception of speech relies on visual pathway representations of speech qua speech. (2) A proposed site of these representations, the temporal visual speech area (TVSA), has been demonstrated in posterior temporal cortex, ventral and posterior to the multisensory posterior superior temporal sulcus (pSTS). (3) Given that visual speech has dynamic and configural features, its representations in feedforward visual pathways are expected to integrate these features, possibly in TVSA.

Multisensory and modality specific processing of visual speech in different regions of the premotor cortex

2014

Behavioral and neuroimaging studies have demonstrated that brain regions involved with speech production also support speech perception, especially under degraded conditions. The premotor cortex (PMC) has been shown to be active during both observation and execution of action ("Mirror System" properties), and may facilitate speech perception by mapping unimodal and multimodal sensory features onto articulatory speech gestures. For this functional magnetic resonance imaging (fMRI) study, participants identified vowels produced by a speaker in audio-visual (saw the speaker's articulating face and heard her voice), visual only (only saw the speaker's articulating face), and audio only (only heard the speaker's voice) conditions with varying audio signal-to-noise ratios, in order to determine the regions of the PMC involved with multisensory and modality-specific processing of visual speech gestures. The task was designed so that identification could be made with a high level of accuracy from visual only stimuli, to control for task difficulty and differences in intelligibility. The fMRI analysis for the visual only and audio-visual conditions showed overlapping activity in the inferior frontal gyrus and PMC. The left ventral inferior premotor cortex (PMvi) showed properties of multimodal (audio-visual) enhancement with a degraded auditory signal, as did the left inferior parietal lobule and right cerebellum. The left ventral superior and dorsal premotor cortex (PMvs/PMd) did not show this multisensory enhancement effect, but instead showed greater activity for the visual only than for the audio-visual conditions.
The results suggest that the inferior regions of the ventral premotor cortex are involved with integrating multisensory information, whereas more superior and dorsal regions of the PMC are involved with mapping unimodal (in this case visual) sensory features of the speech signal onto articulatory speech gestures.

Visual speech speeds up the neural processing of auditory speech

Proceedings of the National Academy of Sciences, 2005

Synchronous presentation of stimuli to the auditory and visual systems can modify the formation of a percept in either modality. For example, perception of auditory speech is improved when the speaker's facial articulatory movements are visible. Neural convergence onto multisensory sites exhibiting supra-additivity has been proposed as the principal mechanism for integration. Recent findings, however, have suggested that putative sensory-specific cortices are responsive to inputs presented through a different modality. Consequently, when and where audiovisual representations emerge remain unsettled. In combined psychophysical and electroencephalography experiments we show that visual speech speeds up the cortical processing of auditory signals early (within 100 ms of signal onset). The auditory–visual interaction is reflected as an articulator-specific temporal facilitation (as well as a nonspecific amplitude reduction). The latency facilitation systematically depends on the deg...

Visual speech perception without primary auditory cortex activation

Neuroreport, 2002

Speech perception is conventionally thought to be an auditory function, but humans often use their eyes to perceive speech. We investigated whether visual speech perception depends on processing by the primary auditory cortex in hearing adults. In a functional magnetic resonance imaging experiment, a pulse-tone was presented, contrasted with gradient noise. During the same session, a silent video of a talker saying isolated words was presented, contrasted with a still face. Visual speech activated the superior temporal gyrus anterior, posterior, and lateral to the primary auditory cortex, but not the region of the primary auditory cortex itself. These results suggest that visual speech perception is not critically dependent on the region of primary auditory cortex. NeuroReport 13:311–315

Modulation of the primary auditory thalamus when recognising speech in noise

2019

Recognising speech in background noise is a strenuous daily activity, yet most humans can master it. A mechanistic explanation of how the human brain deals with such sensory uncertainty is the Bayesian Brain Hypothesis. In this view, the brain uses a dynamic generative model to simulate the most likely trajectory of the speech signal. Such a simulation account can explain why there is a task-dependent modulation of sensory pathway structures (i.e., the sensory thalami) for recognition tasks that require tracking of fast-varying stimulus properties (i.e., speech), in contrast to relatively constant stimulus properties (e.g., speaker identity), despite the same stimulus input. Here we test the specific hypothesis that this task-dependent modulation for speech recognition increases in parallel with the sensory uncertainty in the speech signal. In accordance with this hypothesis, we show—by using ultra-high-resolution functional magnetic resonance imaging in human participants—that the task...

Dual neural routing of visual facilitation in speech processing

The Journal of …, 2009

Viewing our interlocutor facilitates speech perception, unlike, for instance, when we telephone. Several neural routes and mechanisms could account for this phenomenon. Using magnetoencephalography, we show that when seeing the interlocutor, latencies of auditory responses (M100) shorten the more predictable speech is from visual input, whether the auditory signal is congruent or not. Incongruence of auditory and visual input affected auditory responses ∼20 ms after latency shortening was detected, indicating that initial content-dependent auditory facilitation by vision is followed by a feedback signal that reflects the error between expected and received auditory input (prediction error). We then used functional magnetic resonance imaging and confirmed that distinct routes of visual information to auditory processing underlie these two functional mechanisms. Functional connectivity between visual motion and auditory areas depended on the degree of visual predictability, whereas connectivity between the superior temporal sulcus and both auditory and visual motion areas was driven by audiovisual (AV) incongruence. These results establish two distinct mechanisms by which the brain uses potentially predictive visual information to improve auditory perception: a fast, direct corticocortical pathway conveys visual motion parameters to auditory cortex, and a slower, indirect feedback pathway signals the error between visual prediction and auditory input.

Motor speech perception modulates the cortical language areas

NeuroImage, 2008

Traditionally, the left frontal and parietal lobes have been associated with language production, while regions in the temporal lobe are seen as crucial for language comprehension. However, recent evidence suggests that the classical language areas constitute an integrated network in which each area plays a crucial role in both speech production and perception. We used functional MRI to examine whether observing speech motor movements (without auditory speech), relative to non-speech motor movements, preferentially activates the cortical speech areas. Furthermore, we tested whether the activation in these regions was modulated by task difficulty. This dissociates areas that are actively involved in speech perception from regions that show an obligatory activation in response to speech movements (e.g., areas that automatically activate in preparation for a motoric response). Specifically, we hypothesized that regions involved with decoding oral speech would show increasing activation with increasing difficulty. We found that speech movements preferentially activate the frontal and temporal language areas, whereas non-speech movements preferentially activate the parietal region. Degraded speech stimuli increased both frontal and parietal lobe activity but did not differentially excite the temporal region. These findings suggest that the frontal language area plays a role in visual speech perception, and they highlight the differential roles of the classical speech and language areas in processing others' motor speech movements.

Modulation of tonotopic ventral MGB is behaviorally relevant for speech recognition

2019

Sensory thalami are central sensory pathway stations for information processing. Their role for human cognition and perception, however, remains unclear. Recent evidence suggests a specific involvement of the sensory thalami in speech recognition. In particular, the auditory thalamus (medial geniculate body, MGB) response is modulated by speech recognition tasks and the amount of this task-dependent modulation is associated with speech recognition abilities. Here we tested the specific hypothesis that this behaviorally relevant modulation is present in the MGB subsection that corresponds to the primary auditory pathway (i.e., the ventral MGB [vMGB]). We used ultra-high field 7T fMRI to identify the vMGB, and found a significant positive correlation between the amount of task-dependent modulation and the speech recognition performance across participants within left vMGB, but not within the other MGB subsections. These results imply that modulation of thalamic driving input to the au...