Speech recorded in a room with distant microphones is usually distorted by reverberation, which is caused by sound reflecting off the walls. Noise further degrades the quality of the recorded speech. Such noisy, reverberant speech is difficult for both people and machines to understand. To tackle this problem, we have developed a recognition system that combines advanced speech dereverberation, noise reduction, and deep learning-based automatic speech recognition. Our proposed system greatly improves automatic speech recognition performance and achieved the top score in an international reverberant speech recognition competition. This achievement opens the way to more natural interaction with computers or robots.

Marc Delcroix
Media Information Laboratory
Takuya Yoshioka
Media Information Laboratory
Atsunori Ogawa
Media Information Laboratory
Masakiyo Fujimoto
Media Information Laboratory
Nobutaka Ito
Media Information Laboratory
Espi Miquel
Media Information Laboratory
Takaaki Hori
Media Information Laboratory