【外部採録】国際会議Interspeech2023に12本の論文が採録されました。
・Marc Delcroix, Naohiro Tawara, Mireia Diez, Federico Landini, Anna Silnova, Atsunori Ogawa, Tomohiro Nakatani, Lukas Burget, Shoko Araki, ” Multi-Stream Extension of Variational Bayesian HMM Clustering (MS-VBx) for Combined End-to-End and Vector Clustering-based Diarization”
・Naoyuki Kamo, Marc Delcroix, Tomohiro Nakatani, ” Target Speaker Extraction with Conditional Diffusion Model”
・Shoko Araki, Ayako Yamamoto, Tsubasa Ochiai, Kenichi Arai, Atsunori Ogawa, Tomohiro Nakatani, Toshio Irino,” Impact of Residual Noise and Artifacts in Speech Enhancement Errors on Intelligibility of Human and Machine”
・Hiroshi Sato, Ryo Masumura, Tsubasa Ochiai, Marc Delcroix, Takafumi Moriya, Takanori Ashihara, Kentaro Shinayama, Saki Mizuno, Mana Ihori, Tomohiro Tanaka, Nobukatsu Hojo,” Downstream Task Agnostic Speech Enhancement Conditioned on Self-Supervised Representation Loss”
・Takafumi Moriya, Hiroshi Sato, Tsubasa Ochiai, Marc Delcroix, Takanori Ashihara, Kohei Matsuura, Tomohiro Tanaka, Ryo Masumura, Atsunori Ogawa, Taichi Asami,” Knowledge Distillation for Neural Transducer-based Target-Speaker ASR: Exploiting Parallel Mixture/Single-Talker Speech Data”
・Takanori Ashihara, Takafumi Moriya, Kohei Matsuura, Tomohiro Tanaka, Yusuke Ijima, Taichi Asami, Marc Delcroix, Yukinori Honma, ” SpeechGLUE: How Well Can Self-Supervised Speech Models Capture Linguistic Knowledge?”
・Kohei Matsuura, Takanori Ashihara, Takafumi Moriya, Tomohiro Tanaka, Takatomo Kano, Atsunori Ogawa, Marc Delcroix, ” Transfer Learning from Pre-trained Language Models Improves End-to-End Speech Summarization”
・Takuhiro Kaneko, Hirokazu Kameoka, Kou Tanaka, Shogo Seki,” iSTFTNet2: Faster and More Lightweight iSTFT-Based Neural Vocoder Using 1D-2D CNN”
・Daisuke Niizumi, Daiki Takeuchi, Yasunori Ohishi, Noboru Harada, Kunio Kashino,” Masked Modeling Duo for Speech: Specializing General-Purpose Audio Representation to Speech using Denoising Distillation”
・Kou Tanaka, Takuhiro Kaneko, Hirokazu Kameoka, Shogo Seki,” CFVC: Conditional Filtering for Controllable Voice Conversion”
・Hikaru Yanagida, Yusuke Ijima, Naohiro Tawara, "Influence of Personal Traits on Impressions of One's Own Voice"
・Yuki Kitagishi, Naohiro Tawara, Atsunori Ogawa, Ryo Masumura, Taichi Asami, "What are differences? Comparing DNN and human by their performance and characteristics in speaker age estimation"