Visual and Auditory Characteristics of Talkers in Multimodal Integration

Title: Visual and Auditory Characteristics of Talkers in Multimodal Integration
Creators: Shepard, Kyle
Advisor: Weisenberger, Janet
Issue Date: 2009-06
Abstract: Three elements of a spoken interaction can affect how a speech signal is interpreted: the talker, the signal (both visual and auditory), and the listener. Each of these elements is inherently variable, and that variability in turn affects the audiovisual speech percept. Since the work of McGurk and MacDonald (1976), which showed that speech perception is a multimodal process incorporating both auditory and visual cues, numerous investigations have examined the impact of these elements on multimodal integration of speech. Of the three, the impact of talker characteristics on audiovisual integration has received the least attention to date. A recent study by Andrews (2007) provided an initial look at talker characteristics. Her study examined audiovisual integration for speech produced by 14 talkers and found substantial differences across talkers in both auditory and audiovisual intelligibility. However, the talker characteristics that promoted audiovisual integration were not specifically identified. The present study began to address this question by analyzing audiovisual integration performance using two types of reduced-information speech syllables produced by five talkers. In one reduction, fine-structure information was replaced with band-limited noise while the temporal envelope was retained; in the other, the syllables were reduced to a set of three sine waves that followed the formant structure of the syllable (sine-wave speech). Syllables were presented under audiovisual conditions to 10 listeners. Results indicated substantial across-talker differences, and the pattern of talker differences was not affected by the type of reduction applied to the auditory signal. Analysis of confusion matrices provided directions for further analysis of specific auditory and visual speech tokens.
Series/Report no.: The Ohio State University. Department of Speech and Hearing Science Honors Theses; 2009
Keywords: multimodal integration
audiovisual integration
speech perception
talker differences
auditory speech reduction
2-channel filtered speech
sine wave speech
across-talker differences
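
Illustrative note: the sine-wave speech reduction mentioned in the abstract (three sine waves following the formant structure of a syllable) can be sketched roughly as below. This is a minimal sketch under stated assumptions, not the procedure used in the thesis: the function name `sine_wave_speech`, the frame duration, the sample rate, and the formant values are all hypothetical, and each formant is assumed to be held constant within a frame.

```python
import math


def sine_wave_speech(formant_tracks, frame_dur=0.01, sample_rate=16000):
    """Sum one sinusoid per formant track (hypothetical helper).

    formant_tracks: list of tracks, one per formant. Each track is a list of
    (frequency_hz, amplitude) pairs, one pair per analysis frame. All tracks
    must contain the same number of frames.
    Returns a list of float samples.
    """
    n_frames = len(formant_tracks[0])
    if any(len(track) != n_frames for track in formant_tracks):
        raise ValueError("all formant tracks must have the same number of frames")

    samples_per_frame = int(round(frame_dur * sample_rate))
    out = [0.0] * (n_frames * samples_per_frame)

    for track in formant_tracks:
        phase = 0.0  # accumulated so the sinusoid stays continuous across frames
        i = 0
        for freq, amp in track:
            step = 2.0 * math.pi * freq / sample_rate
            for _ in range(samples_per_frame):
                out[i] += amp * math.sin(phase)
                phase += step
                i += 1
    return out


# Made-up formant tracks (F1, F2, F3) over three 10 ms frames, for illustration only.
example_tracks = [
    [(500.0, 1.0), (550.0, 1.0), (600.0, 0.9)],
    [(1500.0, 0.5), (1450.0, 0.5), (1400.0, 0.4)],
    [(2500.0, 0.25), (2500.0, 0.25), (2450.0, 0.2)],
]
example_signal = sine_wave_speech(example_tracks)
```

In practice the formant tracks would come from an acoustic analysis of each recorded syllable; here they are supplied by hand so that the synthesis step can be read in isolation.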
