Multisensory training can promote or impede visual perceptual learning of speech stimuli: visual-tactile vs. visual-auditory training

Multiple Routes to Perceptual Learning

A listener's ability to utilize indexical information in the speech signal can enhance their performance on a variety of speech perception tasks. It is unclear, however, whether such information plays a similar role for spectrally reduced speech signals, such as those experienced by individuals with cochlear implants. The present study compared the effects of training on linguistic versus indexical tasks when adapting to cochlear implant simulations. Listening to sentences processed with an 8-channel sinewave vocoder, three groups of subjects were trained on a transcription task (Transcription), a talker identification task (Talker ID), or a gender identification task (Gender ID). Pre- to post-test comparisons demonstrated that training produced significant improvement for all groups. Moreover, subjects from the Talker ID and Transcription training groups performed similarly at post-test and generalization, and significantly better than the subjects from the Gender ID training group. These data suggest that training on an indexical task that requires high levels of attention can provide equivalent benefit to training on a linguistic task. When listeners selectively focus their attention on the extralinguistic information in the speech signal, they still extract linguistic information; the degree to which they do so, however, appears to be task-dependent.
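
As a concrete illustration of the kind of signal processing behind such cochlear implant simulations, the sketch below implements a minimal 8-channel sinewave vocoder in Python (assuming NumPy and SciPy). The band edges, filter order, and envelope cutoff are illustrative choices, not the parameters used in the study.

```python
# Minimal sketch of an 8-channel sinewave vocoder (illustrative parameters).
import numpy as np
from scipy.signal import butter, sosfiltfilt

def sinewave_vocode(signal, fs, n_channels=8, lo=100.0, hi=5000.0, env_cutoff=30.0):
    signal = np.asarray(signal, dtype=float)
    t = np.arange(len(signal)) / fs
    # Logarithmically spaced band edges across the analysis range.
    edges = np.geomspace(lo, hi, n_channels + 1)
    env_sos = butter(4, env_cutoff, btype="lowpass", fs=fs, output="sos")
    out = np.zeros_like(signal)
    for low, high in zip(edges[:-1], edges[1:]):
        # Band-limit the speech to one analysis channel.
        band_sos = butter(4, [low, high], btype="bandpass", fs=fs, output="sos")
        band = sosfiltfilt(band_sos, signal)
        # Extract the slowly varying amplitude envelope (rectify, then low-pass).
        envelope = np.clip(sosfiltfilt(env_sos, np.abs(band)), 0.0, None)
        # Re-impose the envelope on a sinusoid at the channel centre frequency.
        out += envelope * np.sin(2 * np.pi * np.sqrt(low * high) * t)
    # Roughly match the level of the original signal.
    return out * (np.std(signal) / (np.std(out) + 1e-12))
```

Because only the slowly varying channel envelopes survive, the output preserves the gross temporal structure of the sentence while discarding the fine spectral detail that normal-hearing listeners rely on, which is what makes such stimuli a useful simulation of cochlear implant hearing.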

Reverse hierarchies and sensory learning

Philosophical Transactions of the Royal Society B, 2009

Revealing the relationships between perceptual representations in the brain and mechanisms of adult perceptual learning is of great importance, potentially leading to significantly improved training techniques, both for enhancing skills in the general population and for ameliorating deficits in special populations. In this review, we summarize the essentials of reverse hierarchy theory for perceptual learning in the visual and auditory modalities and describe the theory's implications for designing improved training procedures for a variety of goals and populations.

Verbal and novel multisensory associative learning in adults

F1000Research, 2013

To date, few studies have focused on the behavioural differences between the learning of multisensory auditory-visual and intra-modal associations. More specifically, the relative benefits of novel and verbal auditory-visual associations for learning have not been directly compared. In Experiment 1, 20 adult volunteers completed three paired-associate learning tasks: non-verbal novel auditory-visual (novel-AV), verbal auditory-visual (verbal-AV; using pseudowords), and visual-visual (shape-VV). Participants were directed to make a motor response to matching novel and arbitrarily related stimulus pairs. Feedback was provided to facilitate trial-and-error learning. The results of Signal Detection Theory analyses suggested a multisensory enhancement of learning, with significantly higher discriminability measures (d-prime) in both the novel-AV and verbal-AV tasks than in the shape-VV task. Motor reaction times were also significantly faster during the verbal-AV task than during the non-verbal learning tasks. Experiment 2 (n = 12) used a forced-choice discrimination paradigm to assess whether a difference in unisensory stimulus discriminability could account for the learning trends in Experiment 1. Participants were significantly slower at discriminating unisensory pseudowords than the novel sounds and visual shapes, which was notable given that these stimuli produced superior learning. Together, the findings suggest that verbal information has an added enhancing effect on multisensory associative learning in adults.
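
The discriminability measure referred to here can be made concrete with a short Python sketch (assuming SciPy). The log-linear correction and the example counts below are illustrative assumptions, not values from the study.

```python
# Sketch of the d-prime computation from hit and false-alarm counts.
# The log-linear correction is one common convention, assumed here.
from scipy.stats import norm

def d_prime(hits, misses, false_alarms, correct_rejections):
    # Log-linear correction avoids infinite z-scores when a rate is 0 or 1.
    hit_rate = (hits + 0.5) / (hits + misses + 1.0)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1.0)
    # d' is the separation of the two z-transformed rates.
    return norm.ppf(hit_rate) - norm.ppf(fa_rate)

# Hypothetical counts: responses to matching vs. non-matching pairs.
print(round(d_prime(hits=45, misses=5, false_alarms=12, correct_rejections=38), 2))
```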

Auditory Perceptual Learning for Speech Perception Can be Enhanced by Audiovisual Training

Frontiers in Neuroscience, 2013

Speech perception under audiovisual (AV) conditions is well known to confer benefits to perception such as increased speed and accuracy. Here, we investigated how AV training might benefit or impede auditory perceptual learning of speech degraded by vocoding. In Experiments 1 and 3, participants learned paired associations between vocoded spoken nonsense words and nonsense pictures. In Experiment 1, paired-associates (PA) AV training of one group of participants was compared with audio-only (AO) training of another group. When tested under AO conditions, the AV-trained group was significantly more accurate than the AO-trained group. In addition, pre- and post-training AO forced-choice consonant identification with untrained nonsense words showed that AV-trained participants had learned significantly more than AO participants. The pattern of results pointed to their having learned at the level of the auditory phonetic features of the vocoded stimuli. Experiment 2, a no-training control with testing and re-testing on the AO consonant identification, showed that the controls were as accurate as the AO-trained participants in Experiment 1 but less accurate than the AV-trained participants. In Experiment 3, PA training alternated AV and AO conditions on a list-by-list basis within participants, and training was to criterion (92% correct). PA training with AO stimuli was reliably more effective than training with AV stimuli. We explain these discrepant results in terms of the so-called "reverse hierarchy theory" of perceptual learning and in terms of the diverse multisensory and unisensory processing resources available to speech perception. We propose that early AV speech integration can potentially impede auditory perceptual learning; but visual top-down access to relevant auditory features can promote auditory perceptual learning.

The reverse hierarchy theory of visual perceptual learning

Trends in Cognitive Sciences, 2004

Perceptual learning can be defined as practice-induced improvement in the ability to perform specific perceptual tasks. We previously proposed the Reverse Hierarchy Theory as a unifying concept that links behavioral findings of visual learning with physiological and anatomical data. Essentially, it asserts that learning is a top-down guided process, which begins at high-level areas of the visual system, and when these do not suffice, progresses backwards to the input levels, which have a better signal-to-noise ratio. This simple concept has proved powerful in explaining a broad range of findings, including seemingly contradictory data. We now extend this concept to describe the dynamics of skill acquisition and interpret recent behavioral and electrophysiological findings.

Acquisition versus Consolidation of Auditory Perceptual Learning Using Mixed-Training Regimens

PLoS ONE, 2015

Learning is considered to consist of two distinct phases: acquisition and consolidation. Acquisition can be disrupted when short periods of training on more than one task are interleaved, whereas consolidation can be disrupted when a second task is trained after the first has been initiated. Here we investigated the conditions governing the disruption to acquisition and consolidation during mixed-training regimens in which primary and secondary amplitude modulation tasks were either interleaved or presented consecutively. The secondary task differed from the primary task in either task-irrelevant (carrier frequency) or task-relevant (modulation rate) stimulus features while requiring the same perceptual judgment (amplitude modulation depth discrimination), or shared both irrelevant and relevant features but required a different judgment (amplitude modulation rate discrimination). Based on previous literature we predicted that acquisition would be disrupted by varying the task-relevant stimulus feature during training (stimulus interference), and that consolidation would be disrupted by varying the perceptual judgment required (task interference). We found that varying the task-relevant or -irrelevant stimulus features failed to disrupt acquisition but did disrupt consolidation, whereas mixing two tasks requiring a different perceptual judgment but sharing the same stimulus features disrupted both acquisition and consolidation. Thus, a distinction between acquisition and consolidation phases of perceptual learning cannot simply be attributed to (task-relevant) stimulus versus task interference. We propose instead that disruption occurs during acquisition when mixing two tasks requiring a perceptual judgment based on different cues, whereas consolidation is always disrupted regardless of whether different stimulus features or tasks are mixed. The current study not only provides a novel insight into the underlying mechanisms of perceptual learning, but also has practical implications for the optimal design and delivery of training programs that aim to remediate perceptual difficulties.
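
The distinction between task-relevant and task-irrelevant stimulus features can be made concrete with a short sketch of the stimuli involved: a sinusoidally amplitude-modulated tone, written in Python (assuming NumPy). Here the carrier frequency is the task-irrelevant feature, while modulation rate and depth are the task-relevant ones; all parameter values are illustrative, not the study's.

```python
# Sketch of a sinusoidally amplitude-modulated (AM) tone. Carrier frequency
# is task-irrelevant in the study's terms; modulation rate and depth are
# task-relevant. All values below are illustrative.
import numpy as np

def am_tone(fs=44100, duration=0.5, carrier_hz=1000.0, mod_rate_hz=8.0, mod_depth=0.5):
    t = np.arange(int(fs * duration)) / fs
    # Depth m sweeps the envelope between (1 - m) and (1 + m).
    envelope = 1.0 + mod_depth * np.sin(2 * np.pi * mod_rate_hz * t)
    return envelope * np.sin(2 * np.pi * carrier_hz * t)

# A depth-discrimination trial might compare m = 0.5 against m = 0.6 while
# holding the carrier and modulation rate constant.
standard = am_tone(mod_depth=0.5)
comparison = am_tone(mod_depth=0.6)
```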

Perceptual Training Narrows the Temporal Window of Multisensory Binding

Journal of Neuroscience, 2009

The brain's ability to bind incoming auditory and visual stimuli depends critically on the temporal structure of this information. Specifically, there exists a temporal window of audiovisual integration within which stimuli are highly likely to be bound together and perceived as part of the same environmental event. Several studies have described the temporal bounds of this window, but few have investigated its malleability. Here, the plasticity in the size of this temporal window was investigated using a perceptual learning paradigm in which participants were given feedback during a two-alternative forced-choice (2-AFC) audiovisual simultaneity judgment task. Training resulted in a marked (i.e., approximately 40%) narrowing in the size of the window. To rule out the possibility that this narrowing was the result of changes in cognitive biases, a second experiment employing a two-interval forced-choice (2-IFC) paradigm was undertaken during which participants were instructed to identify a simultaneously presented audiovisual pair presented within one of two intervals. The 2-IFC paradigm resulted in a narrowing that was similar in both degree and dynamics to that using the 2-AFC approach. Together, these results illustrate that different methods of multisensory perceptual training can result in substantial alterations in the circuits underlying the perception of audiovisual simultaneity. These findings suggest a high degree of flexibility in multisensory temporal processing and have important implications for interventional strategies that may be used to ameliorate clinical conditions (e.g., autism, dyslexia) in which multisensory temporal function may be impaired.
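
One common way to quantify such a temporal window, sketched below in Python (assuming NumPy and SciPy), is to fit a Gaussian to the proportion of "simultaneous" responses across stimulus onset asynchronies (SOAs) and summarize the window by the fitted width. The SOAs and response proportions here are made-up illustration values, not the study's data.

```python
# Sketch: estimate the temporal binding window by fitting a Gaussian to the
# proportion of "simultaneous" responses across audiovisual SOAs. The data
# below are made-up illustration values, not the study's results.
import numpy as np
from scipy.optimize import curve_fit

def gaussian(soa_ms, peak, center, sigma):
    return peak * np.exp(-((soa_ms - center) ** 2) / (2 * sigma ** 2))

soas = np.array([-300, -200, -100, 0, 100, 200, 300], dtype=float)
p_simultaneous = np.array([0.10, 0.35, 0.80, 0.95, 0.85, 0.45, 0.15])

params, _ = curve_fit(gaussian, soas, p_simultaneous, p0=[1.0, 0.0, 100.0])
peak, center, sigma = params
# Full width at half maximum as one conventional summary of window size;
# training-induced narrowing would show up as a smaller post-training width.
fwhm_ms = 2 * np.sqrt(2 * np.log(2)) * abs(sigma)
print(f"window centre {center:.0f} ms, width (FWHM) {fwhm_ms:.0f} ms")
```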

Auditory-Visual stimuli: Effects on derived relations with compound stimuli

Behavioral Interventions, 2020

This research explored the effect of teaching conditional discriminations with three procedures on the derivation of 36 untrained stimulus relations (derived relations). The stimuli used consisted of three characteristics of musical instruments, along with the corresponding picture. In the first experiment, six university students were trained with simple stimuli and tested with compound auditory–visual samples; therefore, a one-to-many structure was used. In the second experiment, the auditory sample stimuli were replaced by visual stimuli for a new group of students. A third experiment was implemented with an extra phase of training with compound stimuli for six new students. The structure of the experiments was: pretests (Xbcd–A; Xacd–B; Xabd–C; Xabc–D), training (A–B; A–C; A–D), and posttests (same as pretests). The differences between these conditions were the kind of stimuli used and the additional teaching phase (Xbcd–A) used in condition 3. The results indicate that training with simple stimuli on discriminations involving stimuli that are easy to discriminate from each other (words and sounds) is a sufficient condition for good posttest performance. However, when the comparisons are difficult to discriminate (words only), participants show better performance on new tests if they have a learning history with compound stimuli.
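
The logic of the derived-relations tests can be sketched computationally: once the baseline relations A–B, A–C, and A–D are trained, the remaining ordered pairs within a class should emerge at test without direct training. The Python enumeration below is a schematic of a single stimulus class; how such classes compose into the study's 36 tested relations is our assumption, not something stated in the abstract.

```python
# Schematic of derived (untrained) relations in a one-to-many structure,
# assuming the usual symmetry and equivalence derivations. Labels are
# illustrative, not the study's actual stimuli.
from itertools import permutations

members = ["A", "B", "C", "D"]
trained = {("A", "B"), ("A", "C"), ("A", "D")}

# Every ordered pair of class members is a candidate relation; removing
# the trained baseline leaves the relations that must be derived at test.
derived = [pair for pair in permutations(members, 2) if pair not in trained]
print(len(derived), "derived relations per class:", derived)
# -> 9 per class (3 symmetry relations such as B-A, plus 6 combined
#    relations such as B-C); four such classes would yield 36.
```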

Multisensory perceptual learning and sensory substitution

Neuroscience and Biobehavioral Reviews, 2012

One of the most exciting recent findings in neuroscience has been the capacity for neural plasticity in adult humans and animals. Studies of perceptual learning have provided key insights into the mechanisms of neural plasticity and the changes in functional neuroanatomy that it affords. Key questions in this field of research concern how practice of a task leads to specific or general improvement. Although much of this work has been carried out with a focus on a single sensory modality, primarily visual, there is increasing interest in multisensory perceptual learning. Here we will examine how advances in perceptual learning research both inform and can be informed by the development and advancement of sensory substitution devices for blind persons. To allow 'sight' to occur in the absence of visual input through the eyes, visual information can be transformed by a sensory substitution device into a representation that can be processed as sound or touch, and thus give one the potential to 'see' through the ears or tongue. Investigations of auditory, visual and multisensory perceptual learning can have key benefits for the advancement of sensory substitution, and the study of sensory deprivation and sensory substitution likewise will further the understanding of perceptual learning in general and the reverse hierarchy theory in particular. It also has significant importance for the developing understanding of the brain in metamodal terms, where functional brain areas might be best defined by the computations they carry out rather than by their sensory-specific processing role.

Highlights: The reverse hierarchy theory provides a behavioral model of perceptual learning. The metamodal hypothesis provides a model of computationally defined brain areas. We unify these behavioral and neural models for multisensory perceptual learning. Sensory substitution can benefit from and advance perceptual learning research.