Optimizing Acoustic Array Beamforming to Aid a Speech Recognition System

Please use this identifier to cite or link to this item: http://hdl.handle.net/1811/52868

Files: ECE_683H_HONORS_THESIS.pdf (1.884Mb, PDF)

Title: Optimizing Acoustic Array Beamforming to Aid a Speech Recognition System
Creators: Preobrazhensky, Sergei
Advisor: Potter, Lee
Issue Date: 2012-08
Abstract: The iBrutus is a pilot project in the Computer Science and Engineering (CSE) department at OSU that develops human-computer interaction via spoken dialog. The goal of the iBrutus project is to design a kiosk with a talking avatar on a screen that will answer questions at a public event, such as a football game, in a potentially noisy environment like the Ohio Stadium. In such an environment, the speech recognition software employed by the system would be ineffective without prior processing to obtain a cleaner speech signal. As a rule of thumb, if the iBrutus could correctly interpret 70% or more of the words, it could successfully map the input to a known question/command. To improve the speech recognition rate, the author chose to research a beamforming algorithm. Such an algorithm combines the inputs from a microphone array to minimize interference while preserving the desired signal (i.e., speech arriving from a known direction/location). The goal of the research has been to develop such an algorithm and a means of testing to determine which parameters associated with the algorithm, such as the spatial geometry of the microphone array, will produce the desired speech recognition rate in minimum processing time. The beamforming algorithm designed by the author in MATLAB was a frequency-based wideband Minimum Variance Distortionless Response (MVDR) beamformer. Tests showed that a word recognition rate of at least 70% could be achieved under certain parameter choices. The processing time of the MATLAB-based algorithm is currently larger than desired for use with iBrutus, but there is potential for improvement.
Embargo: No embargo
Series/Report no.: The Ohio State University. Department of Electrical and Computer Engineering Honors Theses; 2012
Keywords: Signal Processing
Speech Recognition
Speech Processing
URI: http://hdl.handle.net/1811/52868
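Illustrative note: the abstract names MVDR beamforming but does not give its formulation. The following is a minimal sketch of the standard narrowband MVDR weight computation, w = R^{-1} d / (d^H R^{-1} d), for a single frequency bin. It is not the author's MATLAB implementation. The uniform linear array geometry, the far-field plane-wave model, the speed of sound, and every function and parameter name below are assumptions made for illustration. A frequency-domain wideband beamformer would typically apply weights of this kind separately in each frequency bin.

```python
import cmath
import math


def steering_vector(num_mics, spacing_m, angle_rad, freq_hz, c=343.0):
    """Far-field steering vector for a uniform linear array (assumed geometry)."""
    delay_per_mic = spacing_m * math.sin(angle_rad) / c
    return [cmath.exp(-2j * math.pi * freq_hz * m * delay_per_mic)
            for m in range(num_mics)]


def interference_covariance(d_int, int_power, noise_power):
    """Covariance of one plane-wave interferer plus uncorrelated sensor noise."""
    n = len(d_int)
    return [[int_power * d_int[i] * d_int[j].conjugate()
             + (noise_power if i == j else 0.0)
             for j in range(n)] for i in range(n)]


def solve(A, b):
    """Solve A x = b (complex) by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [list(row) + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for k in range(col, n + 1):
                M[r][k] -= f * M[col][k]
    x = [0j] * n
    for i in reversed(range(n)):  # back substitution
        s = M[i][n] - sum(M[i][k] * x[k] for k in range(i + 1, n))
        x[i] = s / M[i][i]
    return x


def mvdr_weights(R, d):
    """MVDR weights w = R^{-1} d / (d^H R^{-1} d)."""
    r_inv_d = solve(R, d)
    denom = sum(di.conjugate() * v for di, v in zip(d, r_inv_d))
    return [v / denom for v in r_inv_d]


def response(w, d):
    """Beamformer response w^H d to a plane wave with steering vector d."""
    return sum(wi.conjugate() * di for wi, di in zip(w, d))


if __name__ == "__main__":
    # Hypothetical scenario: 4 microphones, 5 cm apart, one 1 kHz bin,
    # talker at broadside, one strong interferer at 40 degrees.
    d_talker = steering_vector(4, 0.05, 0.0, 1000.0)
    d_noise = steering_vector(4, 0.05, math.radians(40.0), 1000.0)
    R = interference_covariance(d_noise, int_power=100.0, noise_power=1.0)
    w = mvdr_weights(R, d_talker)
    print("gain toward talker:    ", abs(response(w, d_talker)))
    print("gain toward interferer:", abs(response(w, d_noise)))
```

The distortionless constraint shows up as unit gain toward the assumed talker direction, while the minimum-variance criterion drives the gain toward the interferer close to zero. In practice R would be estimated from recorded microphone data instead of being built from a known interferer direction as it is here.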