Audiovisual Integration of Reduced Information Speech Stimuli
Publisher:The Ohio State University
Series/Report no.:The Ohio State University. Department of Speech and Hearing Science Honors Theses; 2008
Every day, without knowing it, we use more than one sense to perceive speech. Speech perception is a combined effort that draws not only on auditory cues but on visual cues as well. This has been observed in situations where one of the cues is impaired, leading to a reliance on the other cue to fill in the missing pieces. An example is a noisy environment in which the auditory cue is difficult to interpret; as a result, the individual starts to depend on his or her ability to interpret the visual cue. It has been found, however, that even when the auditory signal remains intact, individuals still use visual cues and fuse the two sources of information. This is shown in the McGurk effect, in which listeners were presented with an auditory stimulus of “ba” and a visual stimulus of “ga,” with the result that most listeners perceived “da,” a fusion of the two places of articulation. Numerous additional studies have investigated the integration of auditory and visual cues in more detail. In general, three aspects of the process have been identified as important determinants of audiovisual integration: talker characteristics, listener characteristics, and the effect of degrading the auditory stimulus. Previous studies in our lab have demonstrated the effects of degrading the auditory stimulus by reducing its spectral fine structure. Even with as few as four spectral channels of information, subjects have found these stimuli highly intelligible. However, another means of reducing spectral information in speech, a reduction to a series of three sine waves that follow the general formant structure of the stimulus, was found by our subjects to be far less intelligible. Because these previous studies employed different groups of subjects, it is possible that the observed differences in performance are attributable to factors other than the reduced waveforms themselves.
The present study addressed this question by performing a within-subjects comparison of intelligibility for these two types of auditory stimuli. In addition, we evaluated the potential priming effects of presentation order on performance for the two stimulus types. Six talkers and twelve listeners participated in this study. The twelve listeners were divided into three groups of four participants each. The type of auditory stimulus and the order in which it was presented varied across groups. The two stimulus types used in this study were 2-filter degraded speech and sine wave speech. The stimuli were eight CVC syllables, all of which had the same medial vowel and differed only in the initial consonant. The first group was presented the stimuli in alternating order, i.e., the listeners heard the 2-filter degraded speech of a talker and then the sine wave speech of the same talker. The second group listened to all of the sine wave stimuli first and then to all of the 2-filter degraded stimuli. The third group listened to all of the 2-filter degraded stimuli first and then to all of the sine wave stimuli. Each participant was tested under auditory-only presentation, followed by auditory-plus-visual presentation, for each stimulus type. Results demonstrated that participants performed far better with 2-filter speech than with sine wave speech. However, the order in which the stimuli were presented did not have a significant impact on participants' performance. Interestingly, subjects showed more audiovisual integration for sine wave speech than for 2-filter speech, suggesting that a more highly degraded auditory stimulus promotes greater integration.
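The three presentation orders described above can be illustrated with a minimal Python sketch. It is not the procedure actually used in the study: the talker labels, group names, and the function `block_order` are hypothetical, and the ordering of auditory-only versus auditory-plus-visual testing within each block is deliberately left out because the abstract does not specify it in enough detail.

```python
# Hypothetical labels for the six talkers; the abstract gives no identifiers.
TALKERS = [f"talker_{i}" for i in range(1, 7)]
STIMULUS_TYPES = ("2-filter", "sine-wave")


def block_order(group):
    """Return the sequence of (talker, stimulus_type) blocks for one listener group.

    Group names are illustrative labels for the three orders in the abstract.
    """
    if group == "alternating":
        # Group 1: 2-filter speech, then sine wave speech, for the same talker.
        return [(t, s) for t in TALKERS for s in STIMULUS_TYPES]
    if group == "sine-first":
        # Group 2: all sine wave stimuli, then all 2-filter stimuli.
        return [(t, "sine-wave") for t in TALKERS] + [(t, "2-filter") for t in TALKERS]
    if group == "2-filter-first":
        # Group 3: all 2-filter stimuli, then all sine wave stimuli.
        return [(t, "2-filter") for t in TALKERS] + [(t, "sine-wave") for t in TALKERS]
    raise ValueError(f"unknown group: {group}")


if __name__ == "__main__":
    for group in ("alternating", "sine-first", "2-filter-first"):
        print(group, block_order(group)[:3])
```

Every group receives the same twelve talker-by-stimulus blocks and differs only in their order, which is what allows the order effect to be separated from the stimulus-type effect.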
This project was supported by an ASC Undergraduate Scholarship and an SBS Undergraduate Research Scholarship.