Optimizing Acoustic Array Beamforming to Aid a Speech Recognition System

Date

2012-08

Publisher

The Ohio State University

Abstract

iBrutus is a pilot project in the Computer Science and Engineering (CSE) department at OSU that develops human-computer interaction via spoken dialog. The goal of the project is to design a kiosk with an on-screen talking avatar that answers questions at public events, such as a football game, in potentially noisy environments like Ohio Stadium.

In such an environment, the speech recognition software employed by the system would be ineffective without prior processing to obtain a cleaner speech signal. As a rule of thumb, if iBrutus can correctly interpret 70% or more of the words, it can successfully map the input to a known question or command. To improve the speech recognition rate, the author chose to research a beamforming algorithm. Such an algorithm combines the inputs from a microphone array to minimize interference while preserving the desired signal (i.e., speech arriving from a known direction or location).

The goal of the research has been to develop such an algorithm, together with a means of testing it, to determine which parameters associated with the algorithm (such as the spatial geometry of the microphone array) produce the desired speech recognition rate in minimum processing time. The beamforming algorithm designed by the author in MATLAB is a frequency-based wideband Minimum Variance Distortionless Response (MVDR) beamformer. Tests showed that a word recognition rate of at least 70% could be achieved under certain parameter choices. The processing time of the MATLAB-based algorithm is currently longer than desired for use with iBrutus, but there is potential for improvement.
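
For readers unfamiliar with MVDR, the following is a minimal sketch of how the weights for a single frequency bin are commonly computed, using the textbook formula w = R^-1 d / (d^H R^-1 d). It is not the author's MATLAB implementation. It is written in Python with NumPy purely for illustration, and the function names, the far-field plane-wave model, the speed of sound, the diagonal loading term, and the four-microphone demo geometry are all assumptions introduced here. A frequency-based wideband beamformer would typically repeat this computation for each frequency bin of a short-time Fourier transform.

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s, assumed room-temperature value


def steering_vector(mic_positions, direction, freq_hz):
    """Far-field steering vector for one frequency bin (hypothetical helper).

    mic_positions: (M, 3) microphone coordinates in metres.
    direction: vector pointing from the array toward the talker.
    """
    direction = np.asarray(direction, dtype=float)
    direction = direction / np.linalg.norm(direction)
    # A microphone displaced toward the source hears the wavefront earlier,
    # so its delay relative to the origin is negative.
    delays = -(np.asarray(mic_positions, dtype=float) @ direction) / SPEED_OF_SOUND
    return np.exp(-2j * np.pi * freq_hz * delays)


def mvdr_weights(covariance, steering, diagonal_loading=1e-3):
    """MVDR weights w = R^-1 d / (d^H R^-1 d) for one frequency bin."""
    m = covariance.shape[0]
    # Diagonal loading (an assumption added here) keeps R well conditioned.
    load = diagonal_loading * np.trace(covariance).real / m
    r_inv_d = np.linalg.solve(covariance + load * np.eye(m), steering)
    return r_inv_d / (steering.conj() @ r_inv_d)


# Demo with made-up values: 4 microphones on the x-axis, 5 cm apart, 1 kHz bin.
mics = np.array([[0.05 * i, 0.0, 0.0] for i in range(4)])
freq = 1000.0
d_target = steering_vector(mics, [0.0, 1.0, 0.0], freq)  # talker at broadside
theta = np.deg2rad(60.0)                                  # interferer 60 degrees off broadside
d_interf = steering_vector(mics, [np.sin(theta), np.cos(theta), 0.0], freq)

# Synthetic noise covariance: one strong interferer plus weak sensor noise.
R = 10.0 * np.outer(d_interf, d_interf.conj()) + 0.01 * np.eye(4)

w = mvdr_weights(R, d_target)
target_gain = abs(np.vdot(w, d_target))        # |w^H d_target|, held at 1 by design
interf_gain = abs(np.vdot(w, d_interf))        # |w^H d_interf|, driven toward 0
das_interf_gain = abs(np.vdot(d_target / 4, d_interf))  # plain delay-and-sum, for comparison
print(target_gain, interf_gain, das_interf_gain)
```

The distortionless constraint keeps the response toward the talker at unity, while minimizing output variance pushes the response toward the interferer far below what simple delay-and-sum averaging achieves with the same array. In practice the covariance matrix would be estimated from recorded microphone frames rather than constructed synthetically as in this demo.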

Keywords

Signal Processing, Beamforming, Speech Recognition, Acoustics, MVDR, Speech Processing
