Signal Processing Research Group | NTT Communication Science Laboratories

2024

Journal Papers

Tetsuya Ueda, Tomohiro Nakatani, Rintaro Ikeshita, Shoko Araki & Shoji Makino (2024). DOA-Informed Switching Independent Vector Extraction and Beamforming for Speech Enhancement in Underdetermined Situations. EURASIP Journal on Audio, Speech, and Music Processing, 2024.
Takanori Ashihara, Marc Delcroix, Yusuke Ijima & Makio Kashino (2024). Unveiling the Linguistic Capabilities of a Self-Supervised Speech Model Through Cross-Lingual Benchmark and Layer-Wise Similarity Analysis. IEEE Access, 12, 98835-98855.
Reinhold Haeb-Umbach, Tomohiro Nakatani, Marc Delcroix, Christoph Boeddeker & Tsubasa Ochiai (2024). Microphone Array Signal Processing and Deep Learning for Speech Enhancement: Combining model-based and data-driven approaches to parameter estimation and filtering. IEEE Signal Processing Magazine, 41 (6), 12-23.
Rintaro Ikeshita & Tomohiro Nakatani (2024). Geometrically-Regularized Fast Independent Vector Extraction by Pure Majorization-Minimization. IEEE Transactions on Signal Processing, 72, 1560-1575.
Tsubasa Ochiai, Kazuma Iwamoto, Marc Delcroix, Rintaro Ikeshita, Hiroshi Sato, Shoko Araki & Shigeru Katagiri (2024). Rethinking Processing Distortions: Disentangling the Impact of Speech Enhancement Errors on Speech Recognition Performance. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 32, 3589-3602.
Tetsuya Ueda, Tomohiro Nakatani, Rintaro Ikeshita, Keisuke Kinoshita, Shoko Araki & Shoji Makino (2024). Blind and Spatially-Regularized Online Joint Optimization of Source Separation, Dereverberation, and Noise Reduction. IEEE/ACM Transactions on Audio, Speech, and Language Processing (TASLP), 32, 1157-1172.

International Conference Proceedings

Hao Shi, Naoyuki Kamo, Marc Delcroix, Tomohiro Nakatani & Shoko Araki (2024). Ensemble Inference for Diffusion Model-Based Speech Enhancement. ICASSP2024 Satellite Workshop on Hands-Free Speech Communication and Microphone Array (HSCMA). Seoul, Korea.
Thilo von Neumann, Christoph Boeddeker, Tobias Cord-Landwehr, Marc Delcroix & Reinhold Haeb-Umbach (2024). Meeting Recognition with Continuous Speech Separation and Transcription-Supported Diarization. ICASSP2024 Satellite Workshop on Hands-Free Speech Communication and Microphone Array (HSCMA). Seoul, Korea.
Rino Kimura, Tomohiro Nakatani, Naoyuki Kamo, Marc Delcroix, Shoko Araki, Tetsuya Ueda & Shoji Makino (2024). Diffusion model-based MIMO speech denoising and dereverberation. ICASSP2024 Satellite Workshop on Hands-Free Speech Communication and Microphone Array (HSCMA). Seoul, Korea.
Junyi Peng, Marc Delcroix, Tsubasa Ochiai, Oldrich Plchot, Takanori Ashihara, Shoko Araki & Jan Cernocky (2024). Probing Self-supervised Learning Models with Target Speech Extraction. ICASSP2024 Satellite Workshop on Self-supervision in Audio, Speech, and Beyond (SASB). Seoul, Korea.
Takanori Ashihara, Marc Delcroix, Takafumi Moriya, Kohei Matsuura, Taichi Asami & Yusuke Ijima (2024). What do self-supervised speech and speaker models learn? New findings from a cross model layer-wise analysis. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). Seoul, Korea.
William Chen, Takatomo Kano, Atsunori Ogawa, Marc Delcroix & Shinji Watanabe (2024). Train Long and Test Long: Leveraging Full Document Contexts in Speech Processing. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). Seoul, Korea.
Kenichi Fujita, Hiroshi Sato, Takanori Ashihara, Hiroki Kanagawa, Marc Delcroix, Takafumi Moriya & Yusuke Ijima (2024). Noise-robust zero-shot text-to-speech synthesis conditioned on self-supervised speech-representation model with adapters. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). Seoul, Korea.
Kazuma Iwamoto, Tsubasa Ochiai, Marc Delcroix, Rintaro Ikeshita, Hiroshi Sato, Shoko Araki & Shigeru Katagiri (2024). How does end-to-end speech recognition training impact speech enhancement artifacts? IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). Seoul, Korea.
Dominik Klement, Mireia Diez, Federico Landini, Lukáš Burget, Anna Silnova, Marc Delcroix & Naohiro Tawara (2024). Discriminative Training of VBx Diarization. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). Seoul, Korea.
Junyi Peng, Marc Delcroix, Tsubasa Ochiai, Oldrich Plchot, Shoko Araki & Jan Cernocky (2024). Target Speech Extraction with pre-trained self-supervised learning models. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). Seoul, Korea.
Hanako Segawa, Tsubasa Ochiai, Marc Delcroix, Tomohiro Nakatani, Rintaro Ikeshita, Shoko Araki, Takeshi Yamada & Shoji Makino (2024). Neural network-based virtual microphone estimation with virtual microphone and beamformer-level multi-task loss. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). Seoul, Korea.
Naohiro Tawara, Marc Delcroix, Atsushi Ando & Atsunori Ogawa (2024). NTT speaker diarization system for CHiME-7: multi-domain, multi-microphone End-to-end and vector clustering diarization. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). Seoul, Korea.
Keigo Wakayama, Tsubasa Ochiai, Marc Delcroix, Masahiro Yasuda, Shoichiro Saito, Shoko Araki & Akira Nakayama (2024). Online Target Sound Extraction with Knowledge Distillation from Partially Non-Causal Teacher. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). Seoul, Korea.
Kenichi Fujita, Takanori Ashihara, Marc Delcroix & Yusuke Ijima (2024). Lightweight Zero-shot Text-to-Speech with Mixture of Adapters. Interspeech2024. Kos Island, Greece.
Keigo Hojo, Yukoh Wakabayashi, Kengo Ohta, Atsunori Ogawa & Norihide Kitaoka (2024). Boosting CTC-based ASR using inter-layer attention-based CTC loss. Interspeech2024. Kos Island, Greece.
Kohei Matsuura, Takanori Ashihara, Takafumi Moriya, Masato Mimura, Takatomo Kano, Atsunori Ogawa & Marc Delcroix (2024). Sentence-wise Speech Summarization: Task, Datasets, and End-to-End Modeling with LM Knowledge Distillation. Interspeech2024. Kos Island, Greece.
Hiroshi Sato, Takafumi Moriya, Masato Mimura, Shota Horiguchi, Tsubasa Ochiai, Takanori Ashihara, Atsushi Ando, Kentaro Shinayama & Marc Delcroix (2024). SpeakerBeam-SS: Real-time Target Speaker Extraction with Lightweight Conv-TasNet and State Space Modeling. Interspeech2024. Kos Island, Greece.
Tatsunari Takagi, Yukoh Wakabayashi, Atsunori Ogawa & Norihide Kitaoka (2024). Text-only domain adaptation for CTC-based speech recognition through substitution of implicit linguistic information in the search space. Interspeech2024. Kos Island, Greece.
Marvin Tammen, Tsubasa Ochiai, Marc Delcroix, Tomohiro Nakatani, Shoko Araki & Simon Doclo (2024). Array Geometry-Robust Attention-Based Neural Beamformer for Moving Speakers. Interspeech2024. Kos Island, Greece.
Thilo von Neumann, Christoph Boeddeker, Marc Delcroix & Reinhold Haeb-Umbach (2024). MeetEval, Show Me the Errors! Interactive Visualization of Transcript Alignments for the Analysis of Conversational ASR. Show & Tell Demo, IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP). Seoul, Korea.

2023

Journal Papers

Katerina Zmolikova, Marc Delcroix, Tsubasa Ochiai, Keisuke Kinoshita, Jan Cernocky & Dong Yu (2023). Neural Target Speech Extraction: An Overview. IEEE Signal Processing Magazine, 40 (3), 8-29.
Tsubasa Ochiai, Marc Delcroix, Tomohiro Nakatani & Shoko Araki (2023). Mask-based Neural Beamforming for Moving Speakers with Self-Attention-based Tracking. IEEE/ACM Transactions on Audio, Speech, and Language Processing (TASLP), 31, 835-848.
Takafumi Moriya, Hiroshi Sato, Tsubasa Ochiai, Marc Delcroix & Takahiro Shinozaki (2023). Streaming End-to-End Target-Speaker Automatic Speech Recognition and Activity Detection. IEEE Access, 11, 13906-13917.

International Conference Proceedings

Takatomo Kano, Atsunori Ogawa, Marc Delcroix, Roshan Sharma, Kohei Matsuura & Shinji Watanabe (2023). Speech Summarization of Long Spoken Document: Improving Memory Efficiency of Speech/Text Encoders. Proc. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP). Island of Rhodes, Greece.
Atsunori Ogawa, Takafumi Moriya, Naoyuki Kamo, Naohiro Tawara & Marc Delcroix (2023). Iterative Shallow Fusion of Backward Language Model for End-to-End Speech Recognition. Proc. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP). Island of Rhodes, Greece.
Kohei Matsuura, Takanori Ashihara, Takafumi Moriya, Tomohiro Tanaka, Atsunori Ogawa, Marc Delcroix & Ryo Masumura (2023). Leveraging Large Text Corpora for End-to-End Speech Summarization. Proc. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP). Island of Rhodes, Greece.
Thilo von Neumann, Christoph Boeddeker, Keisuke Kinoshita, Marc Delcroix & Reinhold Haeb-Umbach (2023). On Word Error Rate Definitions and their Efficient Computation for Multi-Speaker Speech Recognition Systems. Proc. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP). Island of Rhodes, Greece.
Taishi Nakashima, Rintaro Ikeshita, Nobutaka Ono, Shoko Araki & Tomohiro Nakatani (2023). Fast Online Source Steering Algorithm for Tracking Single Moving Source Using Online Independent Vector Analysis. Proc. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP). Island of Rhodes, Greece.
Marc Delcroix, Naohiro Tawara, Mireia Diez, Federico Landini, Anna Silnova, Atsunori Ogawa, Tomohiro Nakatani, Lukas Burget & Shoko Araki (2023). Multi-Stream Extension of Variational Bayesian HMM Clustering (MS-VBx) for Combined End-to-End and Vector Clustering-based Diarization. Proc. Interspeech. Dublin, Ireland.
Naoyuki Kamo, Marc Delcroix & Tomohiro Nakatani (2023). Target Speaker Extraction with Conditional Diffusion Model. Proc. Interspeech. Dublin, Ireland.
Shoko Araki, Ayako Yamamoto, Tsubasa Ochiai, Kenichi Arai, Atsunori Ogawa, Tomohiro Nakatani & Toshio Irino (2023). Impact of Residual Noise and Artifacts in Speech Enhancement Errors on Intelligibility of Human and Machine. Proc. Interspeech. Dublin, Ireland.
Hiroshi Sato, Ryo Masumura, Tsubasa Ochiai, Marc Delcroix, Takafumi Moriya, Takanori Ashihara, Kentaro Shinayama, Saki Mizuno, Mana Ihori, Tomohiro Tanaka & Nobukatsu Hojo (2023). Downstream Task Agnostic Speech Enhancement Conditioned on Self-Supervised Representation Loss. Proc. Interspeech. Dublin, Ireland.
Takafumi Moriya, Hiroshi Sato, Tsubasa Ochiai, Marc Delcroix, Takanori Ashihara, Kohei Matsuura, Tomohiro Tanaka, Ryo Masumura, Atsunori Ogawa & Taichi Asami (2023). Knowledge Distillation for Neural Transducer-based Target-Speaker ASR: Exploiting Parallel Mixture/Single-Talker Speech Data. Proc. Interspeech. Dublin, Ireland.
Takanori Ashihara, Takafumi Moriya, Kohei Matsuura, Tomohiro Tanaka, Yusuke Ijima, Taichi Asami, Marc Delcroix & Yukinori Honma (2023). SpeechGLUE: How Well Can Self-Supervised Speech Models Capture Linguistic Knowledge? Proc. Interspeech. Dublin, Ireland.
Kohei Matsuura, Takanori Ashihara, Takafumi Moriya, Tomohiro Tanaka, Takatomo Kano, Atsunori Ogawa & Marc Delcroix (2023). Transfer Learning from Pre-trained Language Models Improves End-to-End Speech Summarization. Proc. Interspeech. Dublin, Ireland.
Hikaru Yanagida, Yusuke Ijima & Naohiro Tawara (2023). Influence of Personal Traits on Impressions of One's Own Voice. Proc. Interspeech. Dublin, Ireland.
Yuki Kitagishi, Naohiro Tawara, Atsunori Ogawa, Ryo Masumura & Taichi Asami (2023). What are differences? Comparing DNN and human by their performance and characteristics in speaker age estimation. Proc. Interspeech. Dublin, Ireland.
Yuki Kitagishi, Hosana Kamiyama, Naohiro Tawara, Atsunori Ogawa, Noboru Miyazaki & Taichi Asami (2023). Coarse-age loss: A new training method using coarse-age labeled data for speaker age estimation. Proc. Asia-Pacific Signal and Information Processing Association (APSIPA) Annual Summit and Conference (ASC). Taipei, Taiwan.
Koharu Horii, Kengo Ohta, Ryota Nishimura, Atsunori Ogawa & Norihide Kitaoka (2023). Language modeling for spontaneous speech recognition based on disfluency labeling and generation of disfluent text. Proc. Asia-Pacific Signal and Information Processing Association (APSIPA) Annual Summit and Conference (ASC). Taipei, Taiwan.
Keigo Hojo, Daiki Mori, Yukoh Wakabayashi, Kengo Ohta, Atsunori Ogawa & Norihide Kitaoka (2023). Combining multiple end-to-end speech recognition models based on density ratio approach. Proc. Asia-Pacific Signal and Information Processing Association (APSIPA) Annual Summit and Conference (ASC). Taipei, Taiwan.
Tatsunari Takagi, Atsunori Ogawa, Norihide Kitaoka & Yukoh Wakabayashi (2023). Streaming end-to-end speech recognition using a CTC decoder with substituted linguistic information. Proc. Asia-Pacific Signal and Information Processing Association (APSIPA) Annual Summit and Conference (ASC). Taipei, Taiwan.

2022

Journal Papers

Wangyou Zhang, Xuankai Chang, Christoph Boeddeker, Tomohiro Nakatani, Shinji Watanabe & Yanmin Qian (2022). End-to-end dereverberation, beamforming, and speech recognition in a cocktail party. IEEE/ACM Transactions on Audio, Speech, and Language Processing (TASLP), 30, 3173-3188.
Marc Delcroix, Jorge Bennasar Vázquez, Tsubasa Ochiai, Keisuke Kinoshita, Yasunori Ohishi & Shoko Araki (2022). Soundbeam: target sound extraction conditioned on sound-class labels and enrollment clues for increased performance and continuous learning. IEEE/ACM Transactions on Audio, Speech, and Language Processing (TASLP).
Tomohiro Nakatani, Rintaro Ikeshita, Keisuke Kinoshita, Hiroshi Sawada, Naoyuki Kamo & Shoko Araki (2022). Switching Independent Vector Analysis and its Extension to Blind and Spatially Guided Convolutional Beamforming Algorithms. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 30, 1032-1047.
Thilo von Neumann, Keisuke Kinoshita, Christoph Boeddeker, Marc Delcroix & Reinhold Haeb-Umbach (2022). Segment-Less Continuous Speech Separation of Meetings: Training and Evaluation Criteria. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 31, 576-589.
Naohiro Tawara, Atsunori Ogawa, Tomoharu Iwata, Hiroto Ashihara, Tetsunori Kobayashi & Tetsuji Ogawa (2022). Multi-Source Domain Generalization Using Domain Attributes for Recurrent Neural Network Language Models. IEICE Transactions on Information and Systems, E105.D (1), 150-160.
Zili Huang, Marc Delcroix, Leibny Paola Garcia, Shinji Watanabe, Desh Raj & Sanjeev Khudanpur (2022). Joint speaker diarization and speech recognition based on region proposal networks. Computer Speech & Language, 72, 101316.

International Conference Proceedings

Yasunori Ohishi, Marc Delcroix, Tsubasa Ochiai, Shoko Araki, Daiki Takeuchi, Daisuke Niizumi, Akisato Kimura, Noboru Harada & Kunio Kashino (2022). ConceptBeam: Concept driven target speech extraction. Proc. ACM International Conference on Multimedia (ACMMM). Lisbon, Portugal.
Hiroshi Sawada, Rintaro Ikeshita, Keisuke Kinoshita & Tomohiro Nakatani (2022). Multi-Frame Full-Rank Spatial Covariance Analysis for Underdetermined BSS in Reverberant Environments. ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
Naoyuki Kamo, Rintaro Ikeshita, Keisuke Kinoshita & Tomohiro Nakatani (2022). Importance of Switch Optimization Criterion in Switching WPE Dereverberation. ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
Hiroshi Sato, Tsubasa Ochiai, Marc Delcroix, Keisuke Kinoshita, Naoyuki Kamo & Takafumi Moriya (2022). Learning to Enhance or Not: Neural Network-Based Switching of Enhanced and Observed Signals for Overlapping Speech Recognition. ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
Takatomo Kano, Atsunori Ogawa, Marc Delcroix & Shinji Watanabe (2022). Integrating Multiple ASR Systems into NLP Backend with Attention Fusion. ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
Atsunori Ogawa, Naohiro Tawara, Marc Delcroix & Shoko Araki (2022). Lattice Rescoring Based on Large Ensemble of Complementary Neural Language Models. ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
Keisuke Kinoshita, Marc Delcroix & Tomoharu Iwata (2022). Tight Integration Of Neural- And Clustering-Based Diarization Through Deep Unfolding Of Infinite Gaussian Mixture Model. ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
Thilo von Neumann, Keisuke Kinoshita, Christoph Boeddeker, Marc Delcroix & Reinhold Haeb-Umbach (2022). SA-SDR: A Novel Loss Function for Separation of Meeting Style Data. ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
Takafumi Moriya, Takanori Ashihara, Atsushi Ando, Hiroshi Sato, Tomohiro Tanaka, Kohei Matsuura, Ryo Masumura, Marc Delcroix & Takahiro Shinozaki (2022). Hybrid RNN-T/Attention-Based Streaming ASR with Triggered Chunkwise Attention and Dual Internal Language Model Integration. ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
Kazuma Iwamoto, Tsubasa Ochiai, Marc Delcroix, Rintaro Ikeshita, Hiroshi Sato, Shoko Araki & Shigeru Katagiri (2022). How bad are artifacts?: Analyzing the impact of speech enhancement errors on ASR. Proc. Interspeech 2022.
Marc Delcroix, Keisuke Kinoshita, Tsubasa Ochiai, Katerina Zmolikova, Hiroshi Sato & Tomohiro Nakatani (2022). Listen only to me! How well can target speech extraction handle false alarms? Proc. Interspeech 2022.
Hiroshi Sato, Tsubasa Ochiai, Marc Delcroix, Keisuke Kinoshita, Takafumi Moriya, Naoki Makishima, Mana Ihori, Tomohiro Tanaka & Ryo Masumura (2022). Strategies to Improve Robustness of Target Speech Extraction to Enrollment Variations. Proc. Interspeech 2022.
Takafumi Moriya, Hiroshi Sato, Tsubasa Ochiai, Marc Delcroix & Takahiro Shinozaki (2022). Streaming Target-Speaker ASR with Neural Transducer. Proc. Interspeech 2022.
Martin Kocour, Katerina Zmolikova, Lucas Ondel, Jan Svec, Marc Delcroix, Tsubasa Ochiai, Lukas Burget & Jan Cernocky (2022). Revisiting joint decoding based multi-talker speech recognition with DNN acoustic model. Proc. Interspeech 2022.
Koharu Horii, Meiko Fukuda, Kengo Ohta, Ryota Nishimura, Atsunori Ogawa & Norihide Kitaoka (2022). End-to-End Spontaneous Speech Recognition Using Disfluency Labeling. Proc. Interspeech 2022.
Keisuke Kinoshita, Thilo von Neumann, Marc Delcroix, Christoph Boeddeker & Reinhold Haeb-Umbach (2022). Utterance-by-utterance overlap-aware neural diarization with Graph-PIT. Proc. Interspeech 2022.
Rintaro Ikeshita & Tomohiro Nakatani (2022). ISS2: An Extension of Iterative Source Steering Algorithm for Majorization-Minimization-Based Independent Vector Analysis. 2022 30th European Signal Processing Conference (EUSIPCO).
Ján Švec, Kateřina Žmolíková, Martin Kocour, Marc Delcroix, Tsubasa Ochiai, Ladislav Mošner & Jan Honza Černocký (2022). Analysis of Impact of Emotions on Target Speech Extraction and Speech Separation. 2022 International Workshop on Acoustic Signal Enhancement (IWAENC).
Hanako Segawa, Tsubasa Ochiai, Marc Delcroix, Tomohiro Nakatani, Rintaro Ikeshita, Shoko Araki, Takeshi Yamada & Shoji Makino (2022). Neural Virtual Microphone Estimator: Application to Multi-Talker Reverberant Mixtures. 2022 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC).
Naoyuki Kamo, Kenichi Arai, Atsunori Ogawa, Shoko Araki, Tomohiro Nakatani, Keisuke Kinoshita, Marc Delcroix, Tsubasa Ochiai & Toshio Irino (2022). Speech Intelligibility Prediction through Direct Estimation of Word Accuracy Using Conformer. 2022 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC).
Kenichi Arai, Atsunori Ogawa, Shoko Araki, Keisuke Kinoshita, Tomohiro Nakatani, Naoyuki Kamo & Toshio Irino (2022). Intelligibility prediction of enhanced speech using recognition accuracy of end-to-end ASR systems. 2022 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC).
Ayako Yamamoto, Toshio Irino, Shoko Araki, Kenichi Arai, Atsunori Ogawa, Keisuke Kinoshita & Tomohiro Nakatani (2022). Effective data screening technique for crowdsourced speech intelligibility experiments: Evaluation with IRM-based speech enhancement. 2022 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC).
Tomohiro Nakatani, Rintaro Ikeshita, Keisuke Kinoshita, Hiroshi Sawada, Naoyuki Kamo & Shoko Araki (2022). Switching Independent Vector Extraction and Its Joint Optimization with Weighted Prediction Error Dereverberation. Proc. of the 24th International Congress on Acoustics (ICA 2022).
Takatomo Kano, Atsunori Ogawa, Marc Delcroix & Shinji Watanabe (2021). Attention-Based Multi-Hypothesis Fusion for Speech Summarization. 2021 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU).
Naohiro Tawara, Atsunori Ogawa, Yuki Kitagishi, Hosana Kamiyama & Yusuke Ijima (2021). Robust speech-age estimation using local maximum mean discrepancy under mismatched recording condition. 2021 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU).

2021

Journal Papers

R. Haeb-Umbach, J. Heymann, L. Drude, S. Watanabe, M. Delcroix, and T. Nakatani, "Far-Field Automatic Speech Recognition," Proceedings of the IEEE, vol. 109, no. 2, pp. 124-148, Feb. 2021.
N. Ito, R. Ikeshita, H. Sawada and T. Nakatani, "A Joint Diagonalization Based Efficient Approach to Underdetermined Blind Audio Source Separation Using the Multichannel Wiener Filter," IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2021.
R. Ikeshita, T. Nakatani and S. Araki, "Block Coordinate Descent Algorithms for Auxiliary-Function-Based Independent Vector Extraction," IEEE Transactions on Signal Processing, 2021.
R. Ikeshita and T. Nakatani, "Independent Vector Extraction for Fast Joint Blind Source Separation and Dereverberation," IEEE Signal Processing Letters, vol. 28, pp. 972-976, 2021.
R. Ikeshita, N. Kamo and T. Nakatani, "Blind Signal Dereverberation Based on Mixture of Weighted Prediction Error Models," IEEE Signal Processing Letters, vol. 28, pp. 399-403, 2021.
Rintaro Ikeshita, Keisuke Kinoshita, Naoyuki Kamo & Tomohiro Nakatani (2021). Online Speech Dereverberation Using Mixture of Multichannel Linear Prediction Models. IEEE Signal Processing Letters, 28, 1580-1584.

International Conference Proceedings

C. Li, Y. Luo, C. Han, J. Li, T. Yoshioka, T. Zhou, M. Delcroix, K. Kinoshita, C. Boeddeker, Y. Qian, S. Watanabe, and Z. Chen, "Dual-Path RNN for Long Recording Speech Separation," in Proc. 2021 IEEE Spoken Language Technology Workshop (SLT), 2021, pp. 865-872.
K. Zmolikova, M. Delcroix, L. Burget, T. Nakatani, and J. H. Černocky, "Integration of Variational Autoencoder and Spatial Clustering for Adaptive Multi-Channel Neural Speech Separation," in Proc. 2021 IEEE Spoken Language Technology Workshop (SLT), 2021, pp. 889-896.
H. Sato, T. Ochiai, K. Kinoshita, M. Delcroix, T. Nakatani, S. Araki, "Multimodal Attention Fusion for Target Speaker Extraction," in Proc. IEEE Spoken Language Technology Workshop (SLT), 2021, pp. 778-784.
C. Schymura, T. Ochiai, M. Delcroix, K. Kinoshita, T. Nakatani, S. Araki, D. Kolossa, "Exploiting Attention-based Sequence-to-Sequence Architectures for Sound Event Localization," in Proc. 2020 28th European Signal Processing Conference (EUSIPCO), 2021, pp. 231-235.
S. Watanabe, F. Boyer, X. Chang, P. Guo, T. Hayashi, Y. Higuchi, T. Hori, W. -C Huang, H. Inaguma, N. Kamo, S. Karita, C. Li, J. Shi, A. S. Subramanian, W. Zhang, "The 2020 ESPnet Update: New Features, Broadened Applications, Performance Improvements, and Future Plans," in Proc. 2021 IEEE Data Science & Learning Workshop (DSLW), 2021.
J. Wissing, B. Boenninghoff, D. Kolossa, T. Ochiai, M. Delcroix, K. Kinoshita, T. Nakatani, S. Araki, and C. Schymura, "Data Fusion for Audiovisual Speaker Localization: Extending Dynamic Stream Weights to the Spatial Domain," in Proc. 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2021, pp. 4705-4709.
C. Li, Z. Chen, Y. Luo, C. Han, T. Zhou, K. Kinoshita, M. Delcroix, S. Watanabe, and Y. Qian, "Dual-Path Modeling for Long Recording Speech Separation in Meetings," in Proc. 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2021, pp. 5739-5743.
P. Guo, F. Boyer, X. Chang, T. Hayashi, Y. Higuchi, H. Inaguma, N. Kamo, C. Li, D. Garcia-Romero, J. Shi, J. Shi, S. Watanabe, K. Wei, W. Zhang, and Y. Zhang, "Recent Developments on ESPnet Toolkit Boosted by Conformer," in Proc. 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2021, pp. 5874-5878.
M. Delcroix, K. Zmolikova, T. Ochiai, K. Kinoshita, and T. Nakatani, "Speaker Activity Driven Neural Speech Extraction," in Proc. 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2021, pp. 6099-6103.
T. Ochiai, M. Delcroix, T. Nakatani, R. Ikeshita, K. Kinoshita, and S. Araki, "Neural Network-Based Virtual Microphone Estimator," in Proc. 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2021, pp. 6114-6118.
A. Ogawa, N. Tawara, T. Kano, and M. Delcroix, "BLSTM-Based Confidence Estimation for End-to-End Speech Recognition," in Proc. 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2021, pp. 6383-6387.
N. Tawara, A. Ogawa, Y. Kitagishi, and H. Kamiyama, "Age-VOX-Celeb: Multi-Modal Corpus for Facial and Speech Estimation," in Proc. 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2021, pp. 6963-6967.
K. Kinoshita, M. Delcroix and N. Tawara, "Integrating End-to-End Neural and Clustering-Based Diarization: Getting the Best of Both Worlds," in Proc. 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2021, pp. 7198-7202.
T. Moriya, T. Ashihara, T. Tanaka, T. Ochiai, H. Sato, A. Ando, Y. Ijima, R. Masumura, and Y. Shinohara, "SimpleFlat: A simple whole-network pre-training approach for RNN transducer-based end-to-end speech recognition," in Proc. 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2021, pp. 5664-5668.
C. Boeddeker, W. Zhang, T. Nakatani, K. Kinoshita, T. Ochiai, M. Delcroix, N. Kamo, Y. Qian, R. Haeb-Umbach, "Convolutive Transfer Function Invariant SDR Training Criteria for Multi-Channel Reverberant Speech Separation," in Proc. 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2021, pp. 8428-8432.
W. Zhang, C. Boeddeker, S. Watanabe, T. Nakatani, M. Delcroix, K. Kinoshita, T. Ochiai, N. Kamo, R. Haeb-Umbach, Y. Qian, "End-to-end dereverberation, beamforming, and speech recognition with improved numerical stability and advanced frontend," in Proc. 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2021, pp. 6898-6902.
Tetsuya Ueda, Tomohiro Nakatani, Rintaro Ikeshita, Keisuke Kinoshita, Shoko Araki & Shoji Makino (2021). Low Latency Online Blind Source Separation Based on Joint Optimization with Blind Dereverberation. ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
Hiroshi Sato, Tsubasa Ochiai, Marc Delcroix, Keisuke Kinoshita, Takafumi Moriya & Naoyuki Kamo (2021). Should We Always Separate?: Switching Between Enhanced and Observed Signals for Overlapping Speech Recognition. Proc. Interspeech 2021.
Takafumi Moriya, Tomohiro Tanaka, Takanori Ashihara, Tsubasa Ochiai, Hiroshi Sato, Atsushi Ando, Ryo Masumura, Marc Delcroix & Taichi Asami (2021). Streaming End-to-End Speech Recognition for Hybrid RNN-T/Attention Architecture. Proc. Interspeech 2021.
Christopher Schymura, Benedikt Bönninghoff, Tsubasa Ochiai, Marc Delcroix, Keisuke Kinoshita, Tomohiro Nakatani, Shoko Araki & Dorothea Kolossa (2021). PILOT: Introducing Transformers for Probabilistic Sound Event Localization. Proc. Interspeech 2021.
Marc Delcroix, Jorge Bennasar Vázquez, Tsubasa Ochiai, Keisuke Kinoshita & Shoko Araki (2021). Few-Shot Learning of New Sound Classes for Target Sound Extraction. Proc. Interspeech 2021.
Keisuke Kinoshita, Marc Delcroix & Naohiro Tawara (2021). Advances in Integration of End-to-End Neural and Clustering-Based Diarization for Real Conversational Speech. Proc. Interspeech 2021.
Thilo von Neumann, Keisuke Kinoshita, Christoph Boeddeker, Marc Delcroix & Reinhold Haeb-Umbach (2021). Graph-PIT: Generalized Permutation Invariant Training for Continuous Separation of Arbitrary Numbers of Speakers. Proc. Interspeech 2021.
Cong Han, Yi Luo, Chenda Li, Tianyan Zhou, Keisuke Kinoshita, Shinji Watanabe, Marc Delcroix, Hakan Erdogan, John R. Hershey, Nima Mesgarani & Zhuo Chen (2021). Continuous Speech Separation Using Speaker Inventory for Long Recording. Proc. Interspeech 2021.
Katerina Zmolikova, Marc Delcroix, Desh Raj, Shinji Watanabe & Jan Černocký (2021). Auxiliary Loss Function for Target Speech Extraction and Recognition with Weak Supervision Based on Speaker Characteristics. Proc. Interspeech 2021.
Ayako Yamamoto, Toshio Irino, Kenichi Arai, Shoko Araki, Atsunori Ogawa, Keisuke Kinoshita & Tomohiro Nakatani (2021). Comparison of Remote Experiments Using Crowdsourcing and Laboratory Experiments on Speech Intelligibility. Proc. Interspeech 2021.
Yosuke Higuchi, Naohiro Tawara, Atsunori Ogawa, Tomoharu Iwata, Tetsunori Kobayashi & Tetsuji Ogawa (2021). Noise-robust Attention Learning for End-to-End Speech Recognition. 2020 28th European Signal Processing Conference (EUSIPCO).
Tetsuya Ueda, Tomohiro Nakatani, Rintaro Ikeshita, Keisuke Kinoshita, Shoko Araki & Shoji Makino (2021). Low Latency Online Source Separation and Noise Reduction Based on Joint Optimization with Dereverberation. 2021 29th European Signal Processing Conference (EUSIPCO).
Naoki Narisawa, Rintaro Ikeshita, Norihiro Takamune, Daichi Kitamura, Tomohiko Nakamura, Hiroshi Saruwatari & Tomohiro Nakatani (2021). Independent Deeply Learned Tensor Analysis for Determined Audio Source Separation. 2021 29th European Signal Processing Conference (EUSIPCO).
Hiroshi Sawada, Rintaro Ikeshita & Tomohiro Nakatani (2021). Experimental Analysis of EM and MU Algorithms for Optimizing Full-rank Spatial Covariance Model. 2020 28th European Signal Processing Conference (EUSIPCO).
Tomohiro Nakatani, Rintaro Ikeshita, Naoyuki Kamo, Keisuke Kinoshita, Shoko Araki & Hiroshi Sawada (2021). Switching convolutional beamformer. 2021 29th European Signal Processing Conference (EUSIPCO).
Tomohiro Nakatani, Rintaro Ikeshita, Keisuke Kinoshita, Hiroshi Sawada & Shoko Araki (2021). Blind and neural network-guided convolutional beamformer for joint denoising, dereverberation, and source separation. ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
Thilo von Neumann, Christoph Boeddeker, Keisuke Kinoshita, Marc Delcroix & Reinhold Haeb-Umbach (2021). Speeding up permutation invariant training for source separation. Speech Communication; 14th ITG Conference.

2020

Journal Papers

K. Yamamoto, T. Irino, S. Araki, K. Kinoshita, T. Nakatani, "Speech intelligibility prediction using a multi-resolution Gammachirp envelope distortion index with common parameters for different noise conditions," Acoust. Sci. & Tech., Vol. 41 (1), pp. 396-399, Jan 2020.
T. Nakatani, C. Boeddeker, K. Kinoshita, R. Ikeshita, M. Delcroix, "Jointly optimal denoising, dereverberation, and source separation," IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 28, pp. 2267-2282, 2020.
S. Emura, H. Sawada, S. Araki, N. Harada, "Multi-delay sparse approach to residual crosstalk reduction for blind source separation," IEEE Signal Processing Letters, vol. 27, pp. 1630-1634, Sept. 2020.
N. Ito and S. Godsill, "A Multi-Target Track-Before-Detect Particle Filter Using Superpositional Data in Non-Gaussian Noise," IEEE Signal Processing Letters, vol. 27, pp. 1075-1079, 2020.

International Conference Proceedings

K. Kinoshita, T. Ochiai, M. Delcroix, and T. Nakatani, "Improving noise robust automatic speech recognition with single-channel time-domain enhancement network," in Proc. ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2020, pp. 7009-7013.
T. von Neumann, K. Kinoshita, L. Drude, C. Boeddeker, M. Delcroix, T. Nakatani, and R. Haeb-Umbach, "End-to-end training of time domain audio separation and recognition," in Proc. ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2020, pp. 7004-7008.
N. Tawara, A. Ogawa, T. Iwata, M. Delcroix, and T. Ogawa, "Frame-level phoneme-invariant speaker embedding for text-independent speaker recognition on extremely short utterances," in Proc. ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2020, pp. 6799-6803.
N. Tawara, H. Kamiyama, S. Kobashikawa, and A. Ogawa, "Improving speaker-attribute estimation by voting based on speaker cluster information," in Proc. ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2020, pp. 6594-6598.
T. Nakatani, R. Takahashi, T. Ochiai, K. Kinoshita, R. Ikeshita, M. Delcroix, and S. Araki, "DNN-supported mask-based convolutional beamforming for simultaneous denoising, dereverberation, and source separation," in Proc. ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2020, pp. 6399-6403.
T. Ochiai, M. Delcroix, R. Ikeshita, K. Kinoshita, T. Nakatani, and S. Araki, "Beam-Tasnet: Time-domain audio separation network meets frequency-domain beamformer," in Proc. ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2020, pp. 6384-6388.
M. Delcroix, T. Ochiai, K. Zmolikova, K. Kinoshita, N. Tawara, T. Nakatani, and S. Araki, "Improving speaker discrimination of target speech extraction with time-domain SpeakerBeam," in Proc. ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2020, pp. 691-695.
T. Kondo, K. Fukushige, N. Takamune, D. Kitamura, H. Saruwatari, R. Ikeshita, and T. Nakatani, "Convergence-guaranteed independent positive semidefinite tensor analysis based on student's t distribution," in Proc. ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2020, pp. 681-685.
R. Ikeshita, T. Nakatani, and S. Araki, "Overdetermined independent vector analysis," in Proc. ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2020, pp. 591-595.
C. Schymura, T. Ochiai, M. Delcroix, K. Kinoshita, T. Nakatani, S. Araki, and D. Kolossa, "A dynamic stream weight backprop Kalman filter for audiovisual speaker tracking," in Proc. ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2020, pp. 581-585.
K. Kinoshita, M. Delcroix, S. Araki, and T. Nakatani, "Tackling real noisy reverberant meetings with all-neural source separation, counting, and diarization system," in Proc. ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2020, pp. 381-385.
C. Boeddeker, T. Nakatani, K. Kinoshita, and R. Haeb-Umbach, "Jointly optimal dereverberation and beamforming," in Proc. ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2020, pp. 216-220.
Y. Koizumi, K. Yatabe, M. Delcroix, Y. Masuyama, and D. Takeuchi, "Speech enhancement using self-adaptation and multi-head self-attention," in Proc. ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2020, pp. 181-185.
S. Emura, H. Sawada, S. Araki, and N. Harada, "A frequency-domain BSS method based on l1 norm, unitary constraint, and cayley transform," in Proc. ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2020, pp. 111-115.
A. Aroudi, M. Delcroix, T. Nakatani, K. Kinoshita, S. Araki, and S. Doclo, "Cognitive-Driven Convolutional Beamforming Using EEG-Based Auditory Attention Decoding," in Proc. 2020 IEEE 30th International Workshop on Machine Learning for Signal Processing (MLSP), 2020, pp. 1-6.
T. Nakatani, R. Ikeshita, K. Kinoshita, H. Sawada, and S. Araki, "Computationally Efficient and Versatile Framework for Joint Optimization of Blind Speech Separation and Dereverberation," in Proc. the 21st Annual Conference of the International Speech Communication Association (Interspeech), 2020, pp. 91-95.
K. Arai, S. Araki, A. Ogawa, K. Kinoshita, T. Nakatani, and T. Irino, "Predicting Intelligibility of Enhanced Speech Using Posteriors Derived from DNN-Based ASR System," in Proc. the 21st Annual Conference of the International Speech Communication Association (Interspeech), 2020, pp. 1156-1160.
T. Ochiai, M. Delcroix, Y. Koizumi, H. Ito, K. Kinoshita, and S. Araki, "Listen to What You Want: Neural Network-based Universal Sound Selector," in Proc. the 21st Annual Conference of the International Speech Communication Association (Interspeech), 2020, pp. 1441-1445.
K. Kinoshita, T. von Neumann, M. Delcroix, T. Nakatani, and R. Haeb-Umbach, "Multi-path RNN for Hierarchical Modeling of Long Sequential Data and its Application to Speaker Stream Separation," in Proc. the 21st Annual Conference of the International Speech Communication Association (Interspeech), 2020, pp. 2652-2656.
T. von Neumann, C. Boeddeker, L. Drude, K. Kinoshita, M. Delcroix, T. Nakatani, and R. Haeb-Umbach, "Multi-Talker ASR for an Unknown Number of Sources: Joint Training of Source Counting, Separation and ASR," in Proc. the 21st Annual Conference of the International Speech Communication Association (Interspeech), 2020, pp. 3097-3101.
A. Ogawa, N. Tawara, and M. Delcroix, "Language Model Data Augmentation Based on Text Domain Transfer," in Proc. the 21st Annual Conference of the International Speech Communication Association (Interspeech), 2020, pp. 4926-4930.
T. Moriya, T. Ochiai, S. Karita, H. Sato, T. Tanaka, T. Ashihara, R. Masumura, Y. Shinohara, and M. Delcroix, "Self-distillation for improving CTC-Transformer-based ASR systems," in Proc. the 21st Annual Conference of the International Speech Communication Association (Interspeech), 2020, pp. 546-550.

2019

Journal Papers

T. Nakatani and K. Kinoshita, "A unified convolutional beamformer for simultaneous denoising and dereverberation," IEEE Signal Processing Letters, vol. 26, no. 6, pp. 903-907, 2019.
R. Haeb-Umbach, S. Watanabe, T. Nakatani, M. Bacchiani, B. Hoffmeister, M. L. Seltzer, H. Zen, and M. Souden, "Speech processing for digital home assistants: Combining signal processing with deep-learning techniques," IEEE Signal Processing Magazine, vol. 36, no. 6, pp. 111-124, 2019.
M. Hentschel, M. Delcroix, A. Ogawa, T. Iwata, and T. Nakatani, "Feature based domain adaptation for neural network language models with factorised hidden layers," IEICE Transactions on Information and Systems, vol. E102.D, no. 3, pp. 598-608, 2019.
K. Zmolikova, M. Delcroix, K. Kinoshita, T. Ochiai, T. Nakatani, L. Burget, and J. Cernocky, "SpeakerBeam: Speaker aware neural network for target speaker extraction in speech mixtures," IEEE Journal of Selected Topics in Signal Processing, vol. 13, no. 4, pp. 800-814, 2019.
K. Yamamoto, T. Irino, T. Matsui, S. Araki, K. Kinoshita, and T. Nakatani, "Speech intelligibility prediction with the dynamic compressive Gammachirp filterbank and modulation power spectrum," Acoustical Science and Technology, vol. 40, no. 2, pp. 84-92, 2019.

International Conference Proceedings

S. Araki, N. Ono, K. Kinoshita, and M. Delcroix, "Projection back onto filtered observations for speech separation with distributed microphone array," in Proc. CAMSAP 2019 - IEEE 8th International Workshop on Computational Advances in Multi-Sensor Adaptive Processing (CAMSAP), 2019, pp. 291-295.
S. Karita, N. Chen, T. Hayashi, T. Hori, H. Inaguma, Z. Jiang, M. Someki, N. E. Y. Soplin, R. Yamamoto, X. Wang, S. Watanabe, T. Yoshimura, and W. Zhang, "A comparative study on transformer vs RNN in speech applications," in Proc. ASRU 2019 - 2019 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU), 2019, pp. 449-456.
R. Ikeshita, N. Ito, T. Nakatani, and H. Sawada, "Independent low-rank matrix analysis with decorrelation learning," in Proc. WASPAA 2019 - IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), 2019, pp. 288-292.
T. Nakatani, K. Kinoshita, R. Ikeshita, H. Sawada, and S. Araki, "Simultaneous denoising, dereverberation, and source separation using a unified convolutional beamformer," in Proc. WASPAA 2019 - IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), 2019, pp. 224-228.
K. Arai, S. Araki, A. Ogawa, K. Kinoshita, T. Nakatani, K. Yamamoto, and T. Irino, "Predicting speech intelligibility of enhanced speech using phone accuracy of DNN-based ASR system," in Proc. Interspeech 2019 - the 20th Annual Conference of the International Speech Communication Association, 2019, pp. 4275-4279.
A. Ogawa, M. Delcroix, S. Karita, and T. Nakatani, "Improved deep duel model for rescoring N-best speech recognition list using backward LSTMLM and ensemble encoders," in Proc. Interspeech 2019 - the 20th Annual Conference of the International Speech Communication Association, 2019, pp. 3900-3904.
T. Ochiai, M. Delcroix, K. Kinoshita, A. Ogawa, and T. Nakatani, "Multimodal SpeakerBeam: Single channel target speech extraction with audio-visual speaker clues," in Proc. Interspeech 2019 - the 20th Annual Conference of the International Speech Communication Association, 2019, pp. 2718-2722.
S. Karita, N. E. Y. Soplin, S. Watanabe, M. Delcroix, A. Ogawa, and T. Nakatani, "Improving transformer-based end-to-end speech recognition with connectionist temporal classification and language model integration," in Proc. Interspeech 2019 - the 20th Annual Conference of the International Speech Communication Association, 2019, pp. 1408-1412.
M. Delcroix, S. Watanabe, T. Ochiai, K. Kinoshita, S. Karita, A. Ogawa, and T. Nakatani, "End-to-end SpeakerBeam for single channel target speech recognition," in Proc. Interspeech 2019 - the 20th Annual Conference of the International Speech Communication Association, 2019, pp. 451-455.
T. Nakatani and K. Kinoshita, "Simultaneous denoising and dereverberation for low-latency applications using frame-by-frame online unified convolutional beamformer," in Proc. Interspeech 2019 - the 20th Annual Conference of the International Speech Communication Association, 2019, pp. 111-115.
T. Nakatani and K. Kinoshita, "Maximum likelihood convolutional beamformer for simultaneous denoising and dereverberation," in Proc. EUSIPCO 2019 - the 27th European Signal Processing Conference (EUSIPCO), 2019, pp. 1-5.
R. Ikeshita, N. Ito, T. Nakatani, and H. Sawada, "A unifying framework for blind source separation based on a joint diagonalizability constraint," in Proc. EUSIPCO 2019 - the 27th European Signal Processing Conference (EUSIPCO), 2019, pp. 1-5.
H. Sawada, R. Ikeshita, N. Ito, and T. Nakatani, "Computational acceleration and smart initialization of full-rank spatial covariance analysis," in Proc. EUSIPCO 2019 - the 27th European Signal Processing Conference (EUSIPCO), 2019, pp. 1-5.
M. Hentschel, M. Delcroix, A. Ogawa, T. Iwata, and T. Nakatani, "A unified framework for feature-based domain adaptation of neural network language models," in Proc. ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2019, pp. 7250-7254.
A. Ogawa, T. Hirao, T. Nakatani, and M. Nagata, "ILP-based compressive speech summarization with content word coverage maximization and its oracle performance analysis," in Proc. ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2019, pp. 7190-7194.
T. Ochiai, M. Delcroix, K. Kinoshita, A. Ogawa, and T. Nakatani, "A unified framework for neural speech separation and extraction," in Proc. ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2019, pp. 6975-6979.
M. Delcroix, K. Zmolikova, T. Ochiai, K. Kinoshita, S. Araki, and T. Nakatani, "Compact network for SpeakerBeam target speaker extraction," in Proc. ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2019, pp. 6965-6969.
Y. Kubo, T. Nakatani, M. Delcroix, K. Kinoshita, and S. Araki, "Mask-based MVDR beamformer for noisy multisource environments: Introduction of time-varying spatial covariance model," in Proc. ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2019, pp. 6855-6859.
J. Heymann, L. Drude, R. Haeb-Umbach, K. Kinoshita, and T. Nakatani, "Joint optimization of neural network-based WPE dereverberation and acoustic model for robust online ASR," in Proc. ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2019, pp. 6655-6659.
S. Karita, S. Watanabe, T. Iwata, M. Delcroix, A. Ogawa, and T. Nakatani, "Semi-supervised end-to-end speech recognition using text-to-speech and autoencoders," in Proc. ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2019, pp. 6166-6170.
S. Araki, N. Ono, K. Kinoshita, and M. Delcroix, "Estimation of sampling frequency mismatch between distributed asynchronous microphones under existence of source movements with stationary time periods detection," in Proc. ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2019, pp. 785-789.
N. Ito and T. Nakatani, "FastMNMF: Joint diagonalization based accelerated algorithms for multichannel nonnegative matrix factorization," in Proc. ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2019, pp. 371-375.
T. von Neumann, K. Kinoshita, M. Delcroix, S. Araki, T. Nakatani, and R. Haeb-Umbach, "All-neural online source separation, counting, and diarization for meeting analysis," in Proc. ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2019, pp. 91-95.
S. Emura, S. Araki, T. Nakatani, and N. Harada, "Distortionless beamforming optimized with l1-norm minimization," in Proc. ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2019.

2018

Journal Papers

M. Delcroix, K. Kinoshita, A. Ogawa, C. Huemmer, and T. Nakatani, "Context adaptive neural network based acoustic models for rapid adaptation," IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 26, no. 5, pp. 895-908, 2018.
M. Tomiyama, K. Yamasaki, K. Arai, M. Inubushi, K. Yoshimura, and A. Uchida, "Effect of bandwidth limitation of optical noise injection on common-signal-induced synchronization in multimode semiconductor lasers," Optics Express, vol. 26, p. 13521, May 2018.
S. Emura, S. Araki, T. Nakatani, and N. Harada, "Distortionless beamforming optimized with l1 norm minimization," IEEE Signal Processing Letters, vol. 25, no. 7, pp. 936-940, July 2018.

International Conference Proceedings

N. Ito and T. Nakatani, "Multiplicative updates and joint diagonalization based acceleration for under-determined BSS using a full-rank spatial covariance model," in Proc. 2018 IEEE Global Conference on Signal and Information Processing (GlobalSIP), 2018, pp. 231-235.
M. Hentschel, M. Delcroix, A. Ogawa, and T. Nakatani, "Feature-based learning hidden unit contributions for domain adaptation of RNN-LMs," in Proc. 2018 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC), 2018, pp. 1692-1696.
M. Hentschel, M. Delcroix, A. Ogawa, T. Iwata, and T. Nakatani, "Factorised hidden layer based domain adaptation for recurrent neural network language models," in Proc. 2018 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC), 2018, pp. 1940-1944.
T. Moriya, R. Masumura, T. Asami, Y. Shinohara, M. Delcroix, Y. Yamaguchi, and Y. Aono, "Progressive neural network-based knowledge transfer in acoustic models," in Proc. 2018 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC), 2018, pp. 998-1002.
K. Yamamoto, T. Irino, S. Araki, K. Kinoshita, and T. Nakatani, "Speech intelligibility prediction using a multi-resolution Gammachirp envelope distortion index with common parameters for different noise conditions," in Proc. International Symposium on Universal Acoustical Communication, 2018.
N. Ito and T. Nakatani, "FastFCA-AS: Joint diagonalization based acceleration of full-rank spatial covariance analysis for separating any number of sources," in Proc. IWAENC 2018 - the 16th International Workshop on Acoustic Signal Enhancement (IWAENC), 2018, pp. 151-155.
S. Araki, N. Ono, K. Kinoshita, and M. Delcroix, "Comparison of reference microphone selection algorithms for distributed microphone array based speech enhancement in meeting recognition scenarios," in Proc. IWAENC 2018 - the 16th International Workshop on Acoustic Signal Enhancement (IWAENC), 2018, pp. 316-320.
Y. Matsui, T. Nakatani, M. Delcroix, K. Kinoshita, N. Ito, S. Araki, and S. Makino, "Online integration of DNN-based and spatial clustering-based mask estimation for robust MVDR beamforming," in Proc. IWAENC 2018 - the 16th International Workshop on Acoustic Signal Enhancement (IWAENC), 2018, pp. 71-75.
J. Heymann, L. Drude, R. Haeb-Umbach, K. Kinoshita, and T. Nakatani, "Frame-online DNN-WPE dereverberation," in Proc. IWAENC 2018 - the 16th International Workshop on Acoustic Signal Enhancement (IWAENC), 2018, pp. 466-470.
N. Ito, S. Araki, and T. Nakatani, "FastFCA: Joint diagonalization based acceleration of audio source separation using a full-rank spatial covariance model," in Proc. EUSIPCO 2018 - the 26th European Signal Processing Conference (EUSIPCO), 2018, pp. 1667-1671.
N. Ito, C. Schymura, S. Araki, and T. Nakatani, "Noisy cGMM: Complex Gaussian mixture model with non-sparse noise model for joint source separation and denoising," in Proc. EUSIPCO 2018 - the 26th European Signal Processing Conference (EUSIPCO), 2018, pp. 1662-1666.
L. Drude, C. Boeddeker, J. Heymann, R. Haeb-Umbach, K. Kinoshita, M. Delcroix, and T. Nakatani, "Integrating neural network based beamforming and weighted prediction error dereverberation," in Proc. Interspeech 2018 - the 19th Annual Conference of the International Speech Communication Association, 2018, pp. 3043-3047.
M. Delcroix, S. Watanabe, A. Ogawa, S. Karita, and T. Nakatani, "Auxiliary feature based adaptation of end-to-end ASR systems," in Proc. Interspeech 2018 - the 19th Annual Conference of the International Speech Communication Association, 2018, pp. 2444-2448.
T. Moriya, S. Ueno, Y. Shinohara, M. Delcroix, Y. Yamaguchi, and Y. Aono, "Multi-task learning with augmentation strategy for acoustic-to-word attention-based encoder-decoder speech recognition," in Proc. Interspeech 2018 - the 19th Annual Conference of the International Speech Communication Association, 2018, pp. 2399-2403.
S. Watanabe, T. Hori, S. Karita, T. Hayashi, J. Nishitoba, Y. Unno, N. E. Y. Soplin, J. Heymann, M. Wiesner, N. Chen, A. Renduchintala, and T. Ochiai, "ESPnet: End-to-end speech processing toolkit," in Proc. Interspeech 2018 - the 19th Annual Conference of the International Speech Communication Association, 2018, pp. 2207-2211.
K. Yamamoto, T. Irino, N. Ohashi, S. Araki, K. Kinoshita, and T. Nakatani, "Multi-resolution Gammachirp envelope distortion index for intelligibility prediction of noisy speech," in Proc. Interspeech 2018 - the 19th Annual Conference of the International Speech Communication Association, 2018, pp. 1863-1867.
S. Karita, S. Watanabe, T. Iwata, A. Ogawa, and M. Delcroix, "Semi-supervised end-to-end speech recognition," in Proc. Interspeech 2018 - the 19th Annual Conference of the International Speech Communication Association, 2018, pp. 2-6.
F.-R. Stoter, A. Liutkus, and N. Ito, "The 2018 Signal Separation Evaluation Campaign," in Proc. LVA/ICA, 2018, pp. 293-305.
K. Zmolikova, M. Delcroix, K. Kinoshita, T. Higuchi, T. Nakatani, and J. Cernocky, "Optimization of speaker-aware multichannel speech extraction with ASR criterion," in Proc. ICASSP 2018 - 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2018, pp. 6702-6706.
A. Ogawa, M. Delcroix, S. Karita, and T. Nakatani, "Rescoring N-best speech recognition list based on one-on-one hypothesis comparison using encoder-classifier model," in Proc. ICASSP 2018 - 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2018, pp. 6099-6103.
T. Morioka, N. Tawara, T. Ogawa, A. Ogawa, T. Iwata, and T. Kobayashi, "Language model domain adaptation via recurrent neural networks with domain-shared and domain-specific representations," in Proc. ICASSP 2018 - 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2018, pp. 6084-6088.
S. Karita, A. Ogawa, M. Delcroix, and T. Nakatani, "Sequence training of encoder-decoder model using policy gradient for end-to-end speech recognition," in Proc. ICASSP 2018 - 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2018, pp. 5839-5843.
S. Araki, N. Ono, K. Kinoshita, and M. Delcroix, "Meeting recognition with asynchronous distributed microphone array using block-wise refinement of mask-based MVDR beamformer," in Proc. ICASSP 2018 - 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2018, pp. 5694-5698.
M. Delcroix, K. Zmolikova, K. Kinoshita, A. Ogawa, and T. Nakatani, "Single channel target speaker extraction and recognition with SpeakerBeam," in Proc. ICASSP 2018 - 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2018, pp. 5554-5558.
K. Kinoshita, L. Drude, M. Delcroix, and T. Nakatani, "Listening to each speaker one by one with recurrent selective hearing networks," in Proc. ICASSP 2018 - 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2018, pp. 5064-5068.
L. Drude, T. Higuchi, K. Kinoshita, T. Nakatani, and R. Haeb-Umbach, "Dual frequency- and block-permutation alignment for deep learning based block-online blind source separation," in Proc. ICASSP 2018 - 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2018, pp. 691-695.
N. Ito, T. Makino, S. Araki, and T. Nakatani, "Maximum-likelihood online speaker diarization in noisy meetings based on categorical mixture model and probabilistic spatial dictionary," in Proc. ICASSP 2018 - 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2018, pp. 546-550.
T. Higuchi, K. Kinoshita, N. Ito, S. Karita, and T. Nakatani, "Frame-by-frame closed-form update for mask-based adaptive MVDR beamforming," in Proc. ICASSP 2018 - 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2018, pp. 531-535.
J. Azcarreta, N. Ito, S. Araki, and T. Nakatani, "Permutation-free cGMM: Complex Gaussian mixture model with inverse Wishart mixture model based spatial prior for permutation-free source separation and source counting," in Proc. ICASSP 2018 - 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2018, pp. 51-55.
K. Arai, S. Shinohara, P. Davis, S. Sunada, and T. Harayama, "Chaotic laser based online physical random bit streaming system and its application to high-throughput encryption," in Proc. OFC 2018 - Optical Fiber Communication Conference, 2018, p. Tu3G.3.

2017

Journal Papers

T. Kawase, K. Niwa, M. Fujimoto, K. Kobayashi, S. Araki, and T. Nakatani, "Integration of spatial cue-based noise reduction and speech model-based source restoration for real time speech enhancement," IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences, vol. E100.A, no. 5, pp. 1127-1136, May 2017.
A. Ogawa and T. Hori, "Error detection and accuracy estimation in automatic speech recognition using deep bidirectional recurrent neural networks," Speech Communication, vol. 89, pp. 70-83, May 2017.
T. Higuchi, N. Ito, S. Araki, T. Yoshioka, M. Delcroix, and T. Nakatani, "Online MVDR beamformer based on complex Gaussian mixture model with spatial prior for noise robust ASR," IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 25, no. 4, pp. 780-793, April 2017.
S. Shinohara, K. Arai, P. Davis, S. Sunada, and T. Harayama, "Chaotic laser based physical random bit streaming system with a computer application interface," Optics Express, vol. 25, pp. 6461-6474, 2017.
N. Suzuki, T. Hida, M. Tomiyama, A. Uchida, K. Yoshimura, K. Arai, and M. Inubushi, "Common-signal-induced synchronization in semiconductor lasers with broadband optical noise signal," IEEE Journal of Selected Topics in Quantum Electronics, 2017.

Books and Review Articles

S. Watanabe, M. Delcroix, F. Metze, and J. Hershey (eds.), "New Era for Robust Speech Recognition: Exploiting Deep Learning," Springer, 2017.

International Conference Proceedings

T. Higuchi, K. Kinoshita, M. Delcroix, and T. Nakatani, "Adversarial training for data-driven speech enhancement without parallel corpus," in Proc. ASRU 2017 - IEEE Automatic Speech Recognition and Understanding Workshop (ASRU), 2017, pp. 40-47.
S. Araki, N. Ono, K. Kinoshita, and M. Delcroix, "Meeting recognition with asynchronous distributed microphone array," in Proc. ASRU 2017 - IEEE Automatic Speech Recognition and Understanding Workshop (ASRU), 2017, pp. 32-39.
K. Zmolikova, M. Delcroix, K. Kinoshita, T. Higuchi, A. Ogawa, and T. Nakatani, "Learning speaker representation for neural network based multichannel speaker extraction," in Proc. ASRU 2017 - IEEE Automatic Speech Recognition and Understanding Workshop (ASRU), 2017, pp. 8-15.
H. Ashikawa, N. Tawara, A. Ogawa, T. Iwata, T. Kobayashi, and T. Ogawa, "Exploiting end of sentences and speaker alternations in language modeling for multiparty conversations," in Proc. 2017 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC), 2017, pp. 1263-1267.
M. Hentschel, A. Ogawa, M. Delcroix, T. Nakatani, and Y. Matsumoto, "Exploiting imbalanced textual and acoustic data for training prosodically-enhanced RNNLMs," in Proc. 2017 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC), 2017, pp. 618-621.
D. Tran, M. Delcroix, A. Ogawa, and T. Nakatani, "Uncertainty decoding with adaptive sampling for noise robust DNN-based acoustic modeling," in Proc. Interspeech 2017 - the 18th Annual Conference of the International Speech Communication Association, 2017, pp. 3852-3856.
K. Yamamoto, T. Irino, T. Matsui, S. Araki, K. Kinoshita, and T. Nakatani, "Predicting speech intelligibility using a Gammachirp envelope distortion index based on the signal-to-distortion ratio," in Proc. Interspeech 2017 - the 18th Annual Conference of the International Speech Communication Association, 2017, pp. 2949-2953.
K. Zmolikova, M. Delcroix, K. Kinoshita, T. Higuchi, A. Ogawa, and T. Nakatani, "Speaker-aware neural network based beamformer for speaker extraction in speech mixtures," in Proc. Interspeech 2017 - the 18th Annual Conference of the International Speech Communication Association, 2017, pp. 2655-2659.
A. Ogawa, K. Kinoshita, M. Delcroix, and T. Nakatani, "Improved example-based speech enhancement by using deep neural network acoustic model for noise robust example search," in Proc. Interspeech 2017 - the 18th Annual Conference of the International Speech Communication Association, 2017, pp. 1963-1967.
S. Karita, A. Ogawa, M. Delcroix, and T. Nakatani, "Forward-backward convolutional LSTM for acoustic modeling," in Proc. Interspeech 2017 - the 18th Annual Conference of the International Speech Communication Association, 2017, pp. 1601-1605.
D. Tran, M. Delcroix, S. Karita, M. Hentschel, A. Ogawa, and T. Nakatani, "Unfolded deep recurrent convolutional neural network with jump ahead connections for acoustic modeling," in Proc. Interspeech 2017 - the 18th Annual Conference of the International Speech Communication Association, 2017, pp. 1596-1600.
T. Higuchi, K. Kinoshita, M. Delcroix, K. Zmolikova, and T. Nakatani, "Deep clustering-based beamforming for separation with unknown number of sources," in Proc. Interspeech 2017 - the 18th Annual Conference of the International Speech Communication Association, 2017, pp. 1183-1187.
K. Kinoshita, M. Delcroix, H. Kwon, T. Mori, and T. Nakatani, "Neural network-based spectrum estimation for online WPE dereverberation," in Proc. Interspeech 2017 - the 18th Annual Conference of the International Speech Communication Association, 2017, pp. 384-388.
D. Tran, M. Delcroix, A. Ogawa, C. Huemmer, and T. Nakatani, "Feedback connection for deep neural network-based acoustic modeling," in Proc. ICASSP 2017 - 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2017, pp. 5240-5244.
T. Ochiai, M. Delcroix, K. Kinoshita, A. Ogawa, T. Asami, S. Katagiri, and T. Nakatani, "Cumulative moving averaged bottleneck speaker vectors for online speaker adaptation of CNN-based acoustic models," in Proc. ICASSP 2017 - 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2017, pp. 5175-5179.
T. Higuchi, T. Yoshioka, K. Kinoshita, and T. Nakatani, "Unsupervised utterance-wise beamformer estimation with speech recognition-level criterion," in Proc. ICASSP 2017 - 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2017, pp. 5170-5174.
C. Huemmer, M. Delcroix, A. Ogawa, K. Kinoshita, T. Nakatani, and W. Kellermann, "Online environmental adaptation of CNN-based acoustic models using spatial diffuseness features," in Proc. ICASSP 2017 - 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2017, pp. 4875-4879.
N. Ito, S. Araki, M. Delcroix, and T. Nakatani, "Probabilistic spatial dictionary based online adaptive beamforming for meeting recognition in noisy and reverberant environments," in Proc. ICASSP 2017 - 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2017, pp. 681-685.
T. Nakatani, N. Ito, T. Higuchi, S. Araki, and K. Kinoshita, "Integrating DNN-based and spatial clustering-based mask estimation for robust MVDR beamforming," in Proc. ICASSP 2017 - 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2017, pp. 286-290.
K. Kinoshita, M. Delcroix, A. Ogawa, T. Higuchi, and T. Nakatani, "Deep mixture density network for statistical model-based feature enhancement," in Proc. ICASSP 2017 - 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2017, pp. 251-255.
N. Ito, S. Araki, and T. Nakatani, "Data-driven and physical model-based designs of probabilistic spatial dictionary for online meeting diarization and adaptive beamforming," in Proc. EUSIPCO 2017 - the 25th European Signal Processing Conference (EUSIPCO), 2017, pp. 1165-1169.
S. Araki, N. Ito, M. Delcroix, A. Ogawa, K. Kinoshita, T. Higuchi, T. Yoshioka, D. Tran, S. Karita, and T. Nakatani, "Online meeting recognition in noisy environments with time-frequency mask based MVDR beamforming," in Proc. HSCMA 2017 - the 5th Joint Workshop on Hands-free Speech Communication and Microphone Arrays, 2017, pp. 16-20.
A. Liutkus, F.-R. Stoter, Z. Rafii, D. Kitamura, B. Rivet, N. Ito, N. Ono, and J. Fontecave, "The 2016 Signal Separation Evaluation Campaign," in Proc. LVA/ICA 2017 - the 13th International Conference on Latent Variable Analysis and Signal Separation, 2017, pp. 323-332.
Y. Kawashima, S. Shinohara, S. Sunada, and T. Harayama, "Asymmetric emission of the quadrupole-deformed microcavity laser with spatially selective pumping," Workshop on Asymmetric Microcavity and Wave Chaos, 2017.
Y. Suzuki, S. Shinohara, S. Sunada, and T. Harayama, "Chiral mode lasing in an asymmetrically deformed microcavity laser," Workshop on Asymmetric Microcavity and Wave Chaos, 2017.
T. Harayama, S. Sunada, and S. Shinohara, "Universal single-mode lasing in fully-chaotic billiard lasers," Workshop on Asymmetric Microcavity and Wave Chaos, 2017.

2016

Papers

A. Ogawa, T. Hori, and A. Nakamura, “Estimating speech recognition accuracy based on error type classification,” IEEE Trans. ASLP, vol. 24, no. 12, pp. 2400-2413, December 2016.
S. Shinohara, T. Fukushima, S. Sunada, T. Harayama, and K. Arai, “Long-path formation in a deformed microdisk laser,” Physical Review A, vol. 94, 013831, July 2016.
S. Sunada, S. Shinohara, T. Fukushima, and T. Harayama, “Signature of Wave Chaos in Spectral Characteristics of Microcavity Lasers,” Physical Review Letters, vol. 116, 203903, May 2016.
M. Delcroix, A. Ogawa, S.-J. Hahm, T. Nakatani, and A. Nakamura, “Differenced maximum mutual information criterion for robust unsupervised acoustic model adaptation,” Computer Speech and Language (CSL), Elsevier, vol. 36, pp. 24-41, March 2016.
K. Kinoshita, M. Delcroix, S. Gannot, E. Habets, R. Haeb-Umbach, W. Kellermann, V. Leutnant, R. Maas, T. Nakatani, B. Raj, A. Sehr, and T. Yoshioka, “A summary of the REVERB challenge: state-of-the-art and remaining challenges in reverberant speech processing research,” EURASIP Journal on Advances in Signal Processing, DOI: 10.1186/s13634-016-0306-6, January 2016.

Books and Review Articles

木下, 「音響キーワードブック」「6-1 残響除去」, コロナ社, pp. 212-213, March 2016.

International Conference Proceedings

S. Watanabe, X. Xiao, and M. Delcroix, “Multi-Microphone Speech Recognition,” APSIPA December 2016.
T. Sasaki, I. Kakesu, A. Uchida, S. Sunada, K. Yoshimura, and K. Arai, “Common-signal-induced synchronization in photonic integrated circuits driven by constant-amplitude random-phase light,” NOLTA 2016, C1L-B4, vol. 1, pp. 566-569, November 2016.
T. Higuchi, T. Yoshioka, and T. Nakatani, “Sparseness-based multichannel nonnegative matrix factorization for blind source separation,” IWAENC 2016, September 2016.
M. Fakhry, N. Ito, S. Araki, and T. Nakatani, “Modeling audio directional statistics using a probabilistic spatial dictionary for speaker diarization in real meetings,” IWAENC 2016, September 2016.
T. Higuchi, T. Yoshioka, and T. Nakatani, “Optimization of speech enhancement front-end with speech recognition-level criterion,” Interspeech 2016, September 2016.
A. Ogawa, S. Seki, K. Kinoshita, M. Delcroix, T. Yoshioka, T. Nakatani, and K. Takeda, “Robust example search using bottleneck features for example-based speech enhancement,” Interspeech 2016, pp. 3733-3737, September 2016.
M. Delcroix, K. Kinoshita, A. Ogawa, T. Yoshioka, D. Tran, and T. Nakatani, “Context adaptive neural network for rapid adaptation of deep CNN based acoustic models,” Interspeech 2016, pp. 1573-1577, September 2016.
D. Tran, M. Delcroix, A. Ogawa, and T. Nakatani, “Factorized linear input network for acoustic model adaptation in noisy conditions,” Interspeech 2016, pp. 3813-3817, September 2016.
K. Yamamoto, T. Irino, T. Matsui, S. Araki, K. Kinoshita, and T. Nakatani, “Speech intelligibility prediction based on the envelope power spectrum model with the dynamic compressive Gammachirp auditory filterbank,” Interspeech 2016, pp. 2885-2889, September 2016.
M. Delcroix and S. Watanabe, “Recent advances in distant speech recognition,” Interspeech 2016, September 2016.
K. Zmolikova, M. Karafiat, K. Vesely, M. Delcroix, S. Watanabe, L. Burget, and H. Cernocky, “Data selection by sequence summarizing neural network in mismatch condition training,” Interspeech 2016, September 2016.
L. Li, H. Kameoka, T. Higuchi, and H. Saruwatari, “Semi-supervised joint enhancement of spectral and cepstral sequences of noisy speech,” Interspeech 2016, September 2016.
N. Ito, S. Araki, and T. Nakatani, “Complex angular central Gaussian mixture model for directional statistics in mask-based microphone array signal processing,” EUSIPCO 2016, pp. 1153-1157, August 2016.
N. Murata, H. Kameoka, K. Kinoshita, S. Araki, T. Nakatani, S. Koyama, and H. Saruwatari, “Reverberation-robust underdetermined source separation with non-negative tensor double deconvolution,” EUSIPCO 2016, pp. 1648-1652, August 2016.
S. Sunada, S. Shinohara, T. Fukushima, and T. Harayama, “Wave-chaos-induced single-frequency lasing in microcavities,” NOLTA 2016 the 2016 International Symposium on Nonlinear Theory and its Applications, Paper C1L-B2, 2016.
S. Orihara, K. Koyama, S. Shinohara, S. Sunada, T. Fukushima, and T. Harayama, “Optimal design of two-dimensional external cavities for delayed optical feedback,” NOLTA 2016 the 2016 International Symposium on Nonlinear Theory and its Applications, Paper B3L-B3, 2016.
K. Kawashima, S. Shinohara, S. Sunada, T. Fukushima, and T. Harayama, “Asymmetric emission caused by chaos-assisted tunneling and synchronization in two-dimensional microcavity lasers,” NOLTA 2016 the 2016 International Symposium on Nonlinear Theory and its Applications, Paper C1L-B3, 2016.
S. Suzuki, S. Sunada, S. Shinohara, T. Fukushima, and T. Harayama, “Fast physical random bit generation by chaotic lasers with delayed feedback using extremely short external cavities,” Proceedings of the NOLTA 2016 the 2016 International Symposium on Nonlinear Theory and its Applications, Paper B3L-B2, 2016.
S. Sekiguchi, S. Shinohara, T. Fukushima, and T. Harayama, “Effects of phase space sticky motions in nearly-integrable dielectric billiards on far-field patterns,” NOLTA 2016 the 2016 International Symposium on Nonlinear Theory and its Applications, Paper C2L-B5, 2016.
M. Fujimoto and T. Nakatani, "Multi-pass feature enhancement based on generative-discriminative hybrid approach for noise robust speech recognition," ICASSP 2016, pp. 5750-5754, March 2016.
T. Kawase, K. Niwa, M. Fujimoto, N. Kamado, K. Kobayashi, S. Araki, and T. Nakatani, "Real-time integration of statistical model-based speech enhancement with unsupervised noise psd estimation using microphone array," ICASSP 2016, pp. 604-608, March 2016.
S. Araki, M. Okada, T. Higuchi, A. Ogawa, and T. Nakatani, "Spatial correlation model based observation vector clustering and MVDR beamforming for meeting recognition," ICASSP 2016, pp. 385-389, 2016.
H. Meutzner, S. Araki, M. Fujimoto and T. Nakatani, "A generative-discriminative hybrid approach to multi-channel noise reduction for robust automatic speech recognition," ICASSP2016, pp. 5740-5744, 2016.
T. Yoshioka, K. Ohnishi, F. Fang, and T. Nakatani, “Noise robust speech recognition using recent developments in neural networks for computer vision,” ICASSP 2016, pp. 5730-5734, Mar. 2016.
N. Ito, S. Araki, and T. Nakatani, "Modeling audio directional statistics using a complex Bingham mixture model and its application to blind diffuse noise reduction," ICASSP2016, pp. 465-468, March 2016.
M. Delcroix, K. Kinoshita, C. Yu, A. Ogawa, T. Yoshioka, and T. Nakatani, “Context adaptive deep neural networks for fast acoustic model adaptation in noisy conditions,” Proc. of ICASSP’16, pp. 5270-5274, March 2016.
S. Kundu, G. V. Mantena, Y. Qian, T. Tan, M. Delcroix, and K. C. Sim, “Joint acoustic factor learning for robust deep neural network based automatic speech recognition,” Proc. of ICASSP’16, pp. 5025-5029, March 2016.
T. Higuchi, N. Ito, T. Yoshioka and T. Nakatani, "Robust MVDR beamforming using time-frequency masks for online/offline ASR in noise," ICASSP 2016, pp. 5210-5214, March 2016.

Other Conference Proceedings

K. Yamamoto, T. Irino, T. Matsui, S. Araki, K. Kinoshita, and T. Nakatani, “Analysis of acoustic features for speech intelligibility prediction models,” 5th ASA/ASJ Joint Meeting, Journal of the Acoustical Society of America, vol. 140, no. 4, pt. 2, p. 3114, November 2016.
富山, 鈴木, 内田, 吉村, 新井, 犬伏, “波長フィルタを適用したノイズ光源により駆動された半導体レーザの共通信号入力同期,” 日本光学会年次学術講演会 (Optics＆Photonics Japan 2016), October 2016.
荒木, 木下, 伊藤, 小川, デルクロア, 樋口, 吉岡, チャン, 中谷, “雑音のある環境での複数人会話音声認識,” 日本音響学会2016年度秋季研究発表会講演論文集, pp. 1265-1268, September 2016. (Invited talk)
伊藤，荒木，ファクフリ，中谷, “統計的空間辞書を用いた方向統計量モデルに基づく複数人会話における話者識別,” 日本音響学会2016年度秋季研究発表会講演論文集, pp. 321-324, September 2016.
山本, 入野, 松井, 荒木, 木下, 中谷, “音声明瞭度予測法dcGC-sEPSM の諸検討：評価用雑音の特性と予測精度への影響,” 日本音響学会2016年度秋季研究発表会講演論文集, pp. 663-666, September 2016.
森岡，岩田，小川，俵，小川，小林， “少量データに頑健なニューラルネットワーク言語モデル,” 日本音響学会2016年度秋季研究発表会講演論文集, pp. 89-92, September 2016.
芦川，森岡，小川，岩田，俵，小川，小林, “複数人対話のための話者情報を用いたRNN言語モデル,” 日本音響学会2016年度秋季研究発表会講演論文集, pp. 85-88, September 2016.
李，亀岡，樋口，猿渡，牧野， “音声のスペクトル領域とケプストラム領域における同時強調,” 電子情報通信学会技術研究報告, vol. 116, no. 189, pp. 29-32, August 2016.
峯松，秋田，浅見，伊藤，落，郡山，齋藤，塩田，篠崎，鈴木，高木，俵，橋本，樋口，福田， “国際会議ICASSP2016参加報告,” 情報処理学会研究報告, vol. 2016-SLP-112, no.5, pp.1-6, July 2016.
村田, 亀岡, 木下, 荒木, 中谷, 小山, 猿渡, “非負値テンソル二重逆畳み込みによる残響環境下の劣決定音源分離,” 日本音響学会2016年度春季研究発表会講演論文集, pp. 623-626, March 2016.
中谷，伊藤，樋口，荒木，吉岡，藤本，木下, “NTT CHiME-3 音声認識システム：耐雑音フロントエンド,” 日本音響学会2016年度春季研究発表会講演論文集, pp. 57-60, March 2016.
M. Delcroix, “An overview of the 2015 Jelinek workshop on speech and language technology,” 電子情報通信学会技術研究報告, 応用音響 EA2015-66-EA2015-136, pp. 353-354, March 2016. (Invited talk)
小川，関，木下，デルクロア，吉岡，中谷，武田， “DNN ボトルネック特徴量を用いた頑健な事例探索による事例ベース音声強調の高精度化と高速化,” 日本音響学会2016年度春季研究発表会講演論文集，pp. 633-634, March 2016.
鈴木, 樋田, 内田, 吉村, 新井, “帯域制限されたスーパールミネッセントダイオードによる半導体レーザ間の共通信号入力同期,” 日本光学会年次学術講演会 (Optics＆Photonics Japan 2016), January 2016.
砂田, 篠原, 福嶋, 原山, “カオスビリヤードレーザ：波動カオスのレーザ発振とそのスペクトル特性,” 日本物理学会2016年秋季大会, 15aAF-8, 2016.
小山, 折原, 篠原, 砂田, 福嶋, 原山,“遅延戻り光生成のための２次元外部キャビティデザインの最適化,” 日本物理学会2016年秋季大会, 15aAF-7, 2016.
小山, 原山, 篠原, 福嶋, 砂田, “遅延戻り光生成のための２次元マイクロキャビティ内での長距離周期光線軌道の安定性,” レーザー学会第488回研究会報告「レーザーのカオス・ノイズダイナミクスとその応用」, RTM-16-02, 2016.
関口, 篠原, 福嶋, 原山, “近可積分誘電体ビリヤードにおける生存及び位相空間確率分布の収束性,” レーザー学会第488回研究会報告「レーザーのカオス・ノイズダイナミクスとその応用」, RTM-16-03, 2016.
鈴木, 砂田, 篠原, 福嶋, 原山, “短い外部共振器を用いたカオスレーザーによる高速物理乱数生成,” レーザー学会第488回研究会報告「レーザーのカオス・ノイズダイナミクスとその応用」, RTM-16-06, 2016.
外崎, 砂田, 篠原, 原山, “D型・スタジアム型ビリヤードに関するリアプノフ指数の形状依存性及びMaxwell-Blochモデルによる発振モード数の形状依存性について,” 日本物理学会2016年秋季大会, 15aAF-6, 2016.
藤本, 中谷, "生成-識別ハイブリッドアプローチに基づく多段音声強調法を用いた雑音下音声認識," 日本音響学会2016年度春季研究発表会講演論文集, pp. 61-64, March 2016.
川瀬, 丹羽, 藤本, 鎌土, 小林, 荒木, 中谷, "マイクロホンアレーによる実時間雑音 PSD 推定を用いたモデルベースの音声強調処理技術," 日本音響学会2016年度春季研究発表会講演論文集, pp. 654-656, March 2016.
K. Yamamoto, T. Irino, S. Araki, K. Kinoshita and T. Nakatani, "Study on predicting speech intelligibility of enhanced speech sounds using the dynamic compressive gammachirp auditory filterbank and modulation filterbank," 日本音響学会聴覚研究会, Oct., 2016.
荒木、岡田、樋口、小川、中谷、“時間周波数マスク推定に基づくMVDRビームフォーミングの会議音声認識への適用,”日本音響学会2016年春季研究発表会講演論文集, Mar. 2016.
山本、入野、松井、荒木、木下、中谷、“動的圧縮型ガンマチャープフィルタバンクを用いた音声明瞭度予測法の改良,”日本音響学会聴覚研究会　Feb., 2016.
山本、入野、松井、荒木、木下、中谷、“強調音声のための明瞭度予測法の検証：聴取実験結果との比較,”日本音響学会2016年春季研究発表会講演論文集, Mar., 2016.
吉岡, デルクロア, 小川, ユー, 伊藤, 木下, 藤本, ファビアン, エスピ, 樋口, 荒木, 中谷, “NTT CHiME-3 音声認識システム：全体構成とバックエンド,” 日本音響学会 2016年春季研究発表会講演論文集, Mar. 2016.
伊藤, 荒木, 中谷, "混合複素ビンガム分布を用いた方向統計量モデルとブラインド拡散性雑音除去," 日本音響学会2016年度春季研究発表会講演論文集, pp. 629-630, March 2016.
デルクロア，木下，Yu，小川，吉岡，中谷, “音響コンテキスト適応型DNN に基づく高速音響モデル適応,” 日本音響学会春季研究発表会，pp. 149-150, March 2016.
木下, デルクロア, 小川, 中谷, "発話内容を補助情報として用いたDNN 型高精度音声強調", 日本音響学会2016年春季研究発表会, pp.631-632, March 2016.
李，亀岡，樋口，猿渡，"ケプストラム距離正則化半教師あり NMF による音声強調," 日本音響学会2016年度春季研究発表会講演論文集, pp. 721-724, March 2016.
樋口，伊藤，吉岡，中谷，"時間周波数マスク推定に基づくオンライン MVDR ビームフォーミング," 日本音響学会2016年度春季研究発表会講演論文集, pp. 559-560, March 2016.

2015

Papers

M. Espi, M. Fujimoto, K. Kinoshita, and T. Nakatani, "Exploiting spectro-temporal locality in deep learning based acoustic event detection," EURASIP Journal on Audio, Speech, and Music Processing, DOI: 10.1186/s13636-015-0069-2, December 2015.
M. Espi, M. Fujimoto, and T. Nakatani, "Acoustic event detection in speech overlapping scenarios based on high resolution spectral input and deep learning," IEICE Transactions on Information and Systems, vol. E98-D, no. 10, pp. 1799-1807, October 2015.
T. Harayama and S. Shinohara, “Ray-wave correspondence in chaotic dielectric billiards,” Physical Review E, vol. 92, p. 042916 (6 pages), 2015.
T. Yoshioka and M. J. F. Gales, “Environmentally robust ASR front-end for deep neural network acoustic models,” Computer Speech and Language, vol. 31, no. 1, pp. 65-86, May 2015.
N. Ito, E. Vincent, T. Nakatani, N. Ono, S. Araki, and S. Sagayama, "Blind suppression of nonstationary diffuse acoustic noise based on spatial covariance matrix decomposition," Springer Journal of Signal Processing Systems, vol. 79, no. 2, pp. 145-157, May 2015. (Invited paper)
M. Delcroix, T. Yoshioka, A. Ogawa, Y. Kubo, M. Fujimoto, N. Ito, K. Kinoshita, M. Espi, S. Araki, T. Hori, and T. Nakatani, “Strategies for distant speech recognition in reverberant environments,” EURASIP Journal on Advances in Signal Processing, July 2015.
M. Inubushi, K. Yoshimura, K. Arai, and Peter Davis, "Physical random bit generators and their reliability: focusing on chaotic laser systems", Nonlinear Theory and Its Applications, vol. 6, issue 2, pp. 133-143, 2015.

Press Releases

公共エリア雑音下でのモバイル音声認識の国際技術評価で、世界1位の精度を達成～ひずみなし音声強調とディープラーニング新技術により音声認識を高精度化～, 2015. 12.14.

Books and Review Articles

大庭, 小林, 植松, 浅見, 丹羽, 鎌土, 川瀬, 堀, "ビジネスシーンにおけるサポートを実現するメディア処理技術," NTT技術ジャーナル, vol. 27, no. 2, February 2015.
T. Oba, K. Kobayashi, H. Uematsu, T. Asami, K. Niwa, N. Kamado, T. Kawase, and T. Hori, "Media Processing Technology for Business Task Support," NTT Technical Review, vol. 13, no. 4, April 2015.
荒木, 藤本, 吉岡, デルクロア, エスピ, 中谷, "ディープラーニングを用いた実環境における遠隔発話音声処理", NTT技術ジャーナル、2015年9月号, 2015.
S. Araki, M. Fujimoto, T. Yoshioka, M. Delcroix, M. Espi, and T. Nakatani, "Deep learning based distant talk speech processing in real world sound environments," NTT Technical Review, 2015.
吉岡, “ディープニューラルネットワークの音声認識への応用,” 日経エレクトロニクス(編) 人工知能テクノロジー総覧：ディープラーニング・脳応用・ハードウェア化の最前線, Sep. 2015.
伊藤, 荒木, 中谷, "どんな環境でも聞きたい音を聞き分ける～音響学における未解決問題～," ネイチャーインタフェイス, no. 65, pp. 14-16, December 2015. (Invited paper)
伊藤, 荒木, 中谷, "どんな環境でも聞きたい音を聞き分ける," 日本音響学会誌, vol. 71, no. 3, pp. 136-142, March 2015. (Invited paper)
吉村，砂田，新井, "共通ランダム信号による同期現象：位相縮約による理論解析と半導体レーザーにおける実験", レーザー研究, 第43巻, 第6号, pp. 376-380, 2015.
新井, デイビス, "物理乱数生成に関する最近の動向", システム制御学会誌（システム／制御／情報）, 第58巻, 第11号, November 2014.

International Conference Proceedings

M. Espi, M. Fujimoto, K. Kinoshita, and T. Nakatani, "On the importance of feature extraction for acoustic event detection using deep neural networks," Interspeech 2015, pp. 2922-2926, September 2015.
M. Fujimoto and T. Nakatani, "Feature enhancement based on generative-discriminative hybrid approach with GMMs and DNNs for noise robust speech recognition," ICASSP 2015, pp. 5019-5023, April 2015.
D. Q. Truong, S. Nakamura, M. Delcroix, and T. Hori, "WFST-Based Structural Classification Integrating DNN Acoustic Features and RNN Language Features for Speech Recognition," ICASSP 2015, pp. 4959-4963, April 2015.
S. Araki, T. Hayashi, M. Delcroix, M. Fujimoto, K. Takeda, and T. Nakatani, "Exploring multi-channel features for denoising-autoencoder-based speech enhancement," ICASSP 2015, pp. 116-120, Apr. 2015.
T. Yoshioka, N. Ito, M. Delcroix, A. Ogawa, K. Kinoshita, M. Fujimoto, C. Yu, W. J. Fabian, M. Espi, T. Higuchi, S. Araki, T. Nakatani, “The NTT CHiME-3 system: advances in speech enhancement and recognition for mobile multi-microphone devices,” ASRU 2015, pp. 436-443, Dec. 2015.
T. Yoshioka, S. Karita, and T. Nakatani, “Far-field speech recognition using CNN-DNN-HMM with convolution in time,” ICASSP 2015, pp. 4360-4364, Apr. 2015.
N. Ono, Z. Rafii, D. Kitamura, N. Ito, and A. Liutkus, "The 2015 signal separation evaluation campaign," LVA/ICA2015, pp. 387-395, August 2015.
N. Ito, S. Araki, and T. Nakatani, "Permutation-free clustering of relative transfer function features for blind source separation," EUSIPCO2015, pp. 409-413, September 2015.
M. Delcroix, K. Kinoshita, T. Hori, and T. Nakatani, “Context adaptive deep neural networks for fast acoustic model adaptation,” Proc. of ICASSP’15, pp. 4535–4539, April 2015.
K. Kinoshita, M. Delcroix, A. Ogawa, and T. Nakatani, "Text-informed speech enhancement with deep neural networks," Interspeech 2015, pp. 1760-1764, 2015.
K. Kinoshita and T. Nakatani, "Modeling inter-node acoustic dependencies with Restricted Boltzmann Machine for distributed microphone array based BSS," ICASSP 2015, pp. 464-468, 2015.
N. Suzuki, T. Hida, I. Kakesu, A. Uchida, K. Yoshimura, and K. Arai, "Effect of the bandwidth limitation of an optical noise signal used for common-signal induced synchronization in chaotic semiconductor lasers", XXXV Dynamics Days Europe 2015, September 2015.
C. Yu, A. Ogawa, M. Delcroix, T. Yoshioka, T. Nakatani, and J. H. L. Hansen, "Robust i-vector extraction for neural network adaptation in noisy environment," Proc. Interspeech, pp. 2854-2857, 2015.
A. Ogawa and T. Hori, "ASR error detection and recognition rate estimation using deep bidirectional recurrent neural networks," Proc. IEEE ICASSP, pp. 4370-4374, 2015.
K. Aoyama, A. Ogawa, T. Hattori, and T. Hori, "Double-layer neighborhood graph based similarity search for fast query-by-example spoken term detection," Proc. IEEE ICASSP, pp. 5216-5220, 2015.

Other Conference Proceedings

藤本，中谷，"生成‐識別ハイブリッドアプローチに基づく音声強調手法を用いた雑音下音声認識," 日本音響学会2015年度春季研究発表会講演論文集, pp. 43-46, March 2015.
荒木、林、デルクロア、藤本、武田、中谷、"マルチチャネル特徴を用いた denoising autoencoder による音声強調," 日本音響学会2015年春季研究発表会講演論文集, Mar. 2015.
山本，入野，荒木，木下，中谷，“動的圧縮型ガンマチャープフィルタバンクを用いた強調音声の明瞭度予測法の提案,” 日本音響学会2015年秋季研究発表会講演論文集 Sept., 2015.
吉岡, デルクロア, 藤本, 中谷, “DNN音響モデルにおける特徴量抽出の諸相,” 電子情報通信学会技術研究報告, vol. 115, no. 146, SP2015-46, pp. 61-65, Jul. 2015.
伊藤, 荒木, 中谷, "クラスタリングに基づく音源分離と線形予測に基づく残響除去の確率論的モデル統合," 日本音響学会2015年度春季研究発表会講演論文集, pp. 547-548, March 2015.
伊藤, 荒木, 中谷, "パーミュテーションフリークラスタリングに基づくマルチチャネル雑音除去," 日本音響学会2015年度秋季研究発表会講演論文集, pp. 587-588, Sept. 2015.
森岡，俵，小川，岩田，小川，堀，小林，"複数の文脈長を考慮したリカレントニューラルネットワークに基づく言語モデル，" 日本音響学会2015年度秋季研究発表会講演論文集，1-2-7, Sept. 2015.

2014

Papers

S. Shinohara, S. Sunada, T. Fukushima, T. Harayama, K. Arai, and K. Yoshimura, “Efficient optical path folding by using multiple total internal reflections in a microcavity,” Applied Physics Letters, vol. 105, p.151111 (4 pages), 2014.
T. Fukushima, S. Shinohara, S. Sunada, T. Harayama, K. Sakaguchi, and Y. Tokuda, “Lasing of TM modes in a two-dimensional GaAs microlaser,” Optics Express, vol. 22, pp.11912-11917, 2014.
S. Sunada, T. Fukushima, S. Shinohara, T. Harayama, K. Arai, and M. Adachi, “A compact chaotic laser device with a two-dimensional external cavity structure,” Applied Physics Letters, vol. 104, p.241105 (4 pages), 2014.
S. Shinohara, T. Fukushima, S. Sunada, T. Harayama, K. Arai, and K. Yoshimura, “Anticorrelated bidirectional output from quasistadium-shaped semiconductor microlasers,” Optical Review, vol. 21, pp.113-116, 2014.
丸山, 荒木, 中谷, 宮部, 山田, 牧野, 中村,　"周波数依存到来時間差推定に基づく劣決定ブラインド音源分離の高速化,"日本音響学会論文誌, Vol. 70, No. 6, pp. 323-331, June 2014.
T. Otsuka, K. Ishiguro, T. Yoshioka, H. Sawada, and H. G. Okuno, “Multichannel Sound Source Dereverberation and Separation for Arbitrary Number of Sources Based on Bayesian Nonparametrics,” IEEE Transactions on Audio, Speech, and Language Processing, vol. 22, no. 12, pp. 2218-2232, Oct. 2014.
伊藤, 荒木, 木下, 中谷, "音源位置情報に基づく劣決定ブラインド音源分離のためのパーミュテーションフリークラスタリング法," 電子情報通信学会論文誌A, vol. J97-A, no. 4, pp. 234-246, April 2014. (Invited paper)
S. Sunada, K. Arai, K. Yoshimura, and M. Adachi, "Optical Phase Synchronization by Injection of Common Broadband Low-Coherent Light", Physical Review Letters, Vol. 112, 204101, May 2014.
R. Takahashi, Y. Akizawa, A. Uchida, T. Harayama, K. Tsuzuki, S. Sunada, K. Arai, K. Yoshimura, and P. Davis, "Fast physical random bit generation with photonic integrated circuits with different external cavity lengths for chaos generation", Optics Express, vol. 22, pp. 11727-11740, May 2014.
S. Yamahata, Y. Yamaguchi, A. Ogawa, H. Masataki, O. Yoshioka, and S. Takahashi, "Automatic vocabulary adaptation based on semantic and acoustic similarities," IEICE Trans. Inf. & Syst. Vol. E97-D, No.6, pp.1488-1496, June 2014.

Books and Review Articles

Y. Iwata, T. Nakatani, T. Yoshioka, M. Fujimoto, and H. Saito, "Maximum a posteriori spectral estimation with source log-spectral priors for multichannel speech enhancement," in "Advances in Speech and Audio Processing for Coding, Enhancement and Recognition," pp. 281-317, Springer, October 2014.

International Conference Proceedings

M. Fujimoto, Y. Kubo, and T. Nakatani, "Unsupervised non-parametric Bayesian modeling of non-stationary noise for model-based noise suppression," ICASSP 2014, pp. 5562-5566, May 2014.
T. Hori, Y. Kubo, and A. Nakamura, "Real-time one-pass decoding with recurrent neural network language model for speech recognition," ICASSP 2014, pp. 6364-6368, May 2014.
M. Espi, M. Fujimoto, Y. Kubo, and T. Nakatani, "Spectrogram patch based acoustic event detection and classification in speech overlapping conditions," in Proceedings of the 4th Joint Workshop on Hands-free Speech Communication and Microphone Array (HSCMA 2014), pp. 117-121, May 2014.
Y. Kubo, J. Suzuki, T. Hori, A. Nakamura, "Restructuring Output Layers of Deep Neural Networks Using Minimum Risk Parameter Clustering," Interspeech 2014, pp. 1068-1072, September 2014.
T. Yoshioka, A. Ragni, and M. J. F. Gales, “Investigation of unsupervised adaptation of DNN acoustic models with filter bank input,” ICASSP 2014, pp. 6344-6348, May 2014.
T. Yoshioka, X. Chen, and M. J. F. Gales, “Impact of single-microphone dereverberation on DNN-based meeting transcription systems,” ICASSP 2014, pp. 5527-5531, May 2014.
N. Ito, S. Araki, T. Yoshioka, and T. Nakatani, "Relaxed disjointness based clustering for joint blind source separation and dereverberation," IWAENC2014, pp. 268-272, September, 2014.
N. Ito, S. Araki, and T. Nakatani, "Probabilistic integration of diffuse noise suppression and dereverberation," ICASSP2014, pp. 5167-5171, May 2014.
M. Delcroix, T. Yoshioka, A. Ogawa, Y. Kubo, M. Fujimoto, N. Ito, K. Kinoshita, M. Espi, T. Nakatani, and A. Nakamura, “Linear prediction-based dereverberation with advanced speech enhancement and recognition technologies for the REVERB challenge,” Proc. of the REVERB Challenge Workshop, May 2014. (Best performance on the recognition task of the REVERB Challenge)
M. Delcroix, T. Yoshioka, A. Ogawa, Y. Kubo, M. Fujimoto, N. Ito, K. Kinoshita, M. Espi, S. Araki, and T. Nakatani, “Defeating reverberation: Advanced dereverberation and recognition techniques for hands-free speech recognition,” Invited paper to IEEE Global Conference on Signal and Information Processing (GlobalSIP), pp. 522-526, December 2014.
K. Arai, "Synchronization of semiconductor lasers for secret key distribution", Forum Math-for-Industry 2014, October 2014.
I. Kakesu, N. Suzuki, A. Uchida, K. Yoshimura, K. Arai, and Peter Davis, "Frequency Dependence of Common-Signal-Induced Synchronization in Semiconductor Lasers with Constant-Amplitude and Random-Phase Light", The 2014 International Symposium on Nonlinear Theory and its Applications, pp. 466-469, September 2014.
S. Sunada, K. Arai, K. Yoshimura, and M. Adachi, "Common Noise-Induced Optical Phase Synchronization in Lasers", The 2014 International Symposium on Nonlinear Theory and its Applications, pp. 470-473, September 2014.
K. Arai, K. Yoshimura, S. Sunada, and A. Uchida, "Synchronization Induced by Common ASE Noise in Semiconductor Lasers", The 2014 International Symposium on Nonlinear Theory and its Applications, pp. 474-477, September 2014.
A. Ogawa, K. Kinoshita, T. Hori, T. Nakatani, and A. Nakamura, "Fast segment search for corpus-based speech enhancement based on speech recognition technology," Proc. IEEE ICASSP, pp. 1576-1580, 2014.
K. Aoyama, A. Ogawa, T. Hattori, T. Hori, and A. Nakamura, "Zero-resource spoken term detection using hierarchical graph-based similarity search," Proc. IEEE ICASSP, pp. 7143-7147, 2014.

Other Conference Proceedings

M. Espi, M. Fujimoto, and T. Nakatani, "Detection and classification of acoustic events using multiple resolution spectrogram patch models," in Proceedings of ASJ Autumn Meeting, pp. 1529-1530, September 2014.
M. Espi, M. Fujimoto, Y. Kubo, and T. Nakatani, "A spectrogram-patch-input DNN model for detection and classification of acoustic events robust to speech overlapping scenarios," 音学シンポジウム2014, April 2014.
鎌土, 浅見, 藤本, 木下, 青野, 政瀧, 阪内, "主話者音声区間検出への雑音抑圧法と残響除去法の応用," 日本音響学会2014年度春季研究発表会, pp. 25-28, March 2014.
堀, 久保, 中村, "リカレントニューラルネットワーク言語モデルを用いた実時間ワンパスデコーディングの検討," 日本音響学会2014年春季研究発表会, pp. 61-62, March 2014.
篠田, 堀, 堀, 篠崎, "「音声認識」は今後こうなる！" 情報処理学会研究報告 vol. 2014-SLP-100 No. 2, 2014.
荒木, 堀, 中谷, "会話シーン分析の複数人自由会話音声認識における音声強調," 電子情報通信学会技術報告, vol. 114, no. 274, EA2014-25, pp. 9-14, Oct., 2014.
伊藤, 荒木, 中谷, "確率的モデル統合に基づく拡散性雑音と残響の同時ブラインド抑圧," 日本音響学会2014年度春季研究発表会講演論文集, pp. 667-668, March 2014.
伊藤, 荒木, 中谷, "音源位置と音源アクティビティに基づく残響下・劣決定条件での音源数推定と音源分離," 日本音響学会2014年度秋季研究発表会講演論文集, pp. 521-522, September 2014.
デルクロア、木下、吉岡、小川、久保、藤本、伊藤、エスピ、堀、中谷、中村, “残響下音声認識のための音声強調・認識技術：REVERBチャレンジにおけるNTT提案システムについて,” 音声研究会・音声言語情報処理研究会　合同研究会, 2014. (Invited talk)
木下、デルクロア、吉岡、中谷, “REVERB challenge（残響下音声強調・認識チャレンジ）：企画概要と結果報告,” 日本音響学会2014年春季研究発表会, pp. 655-658, March 2014.
小川，堀，"Bidirectional RNNを用いた音声認識誤り検出と認識率推定，" 日本音響学会2014年度秋季研究発表会講演論文集，1-8-9, Sept. 2014.
小川，木下，堀，中谷，中村，"事例の構造化表現に基づく事例ベース音声強調の高速化，" 日本音響学会2014年春季研究発表会，2-1-1, March 2014.

2013

Papers

M. Delcroix, K. Kinoshita, T. Nakatani, S. Araki, A. Ogawa, T. Hori, S. Watanabe, M. Fujimoto, T. Yoshioka, T. Oba, Y. Kubo, M. Souden, S.-J. Hahm, and A. Nakamura, "Speech recognition in living rooms: Integrated speech enhancement and recognition system based on spatial, spectral & temporal modeling of sounds," Computer Speech and Language (CSL), vol. 27, issue 3, pp. 851-873, May 2013.
[Sound Demo] Demonstration of the results obtained for the PASCAL 'CHiME' Challenge
M. Delcroix, S. Watanabe, T. Nakatani, and A. Nakamura, "Cluster-based dynamic variance adaptation for interconnecting speech enhancement pre-processor and speech recognizer," Computer Speech and Language, Elsevier, vol. 27, issue 1, pp. 350-368, January 2013.
T. Nakatani, S. Araki, T. Yoshioka, M. Delcroix, and M. Fujimoto, "Dominance based integration of spatial and spectral features for speech enhancement," IEEE Transactions on Audio, Speech, and Language Processing, vol. 21, no. 12, pp. 2516-2531, December 2013.
T. Yoshioka and T. Nakatani, "Noise model transfer: novel approach to robustness against nonstationary noise," IEEE Transactions on Audio, Speech, and Language Processing, vol. 21, no. 10, pp. 2182-2192, October 2013.
M. Souden, K. Kinoshita, M. Delcroix, and T. Nakatani, "Location feature integration for clustering-based speech separation in distributed microphone arrays," IEEE Transactions on Audio, Speech, and Language Processing, 2013.
M. Souden, S. Araki, K. Kinoshita, T. Nakatani, and H. Sawada, "A multichannel MMSE-based framework for speech source separation and noise reduction," IEEE Transactions on Audio Speech and Language Processing, vol. 21, no. 9, pp. 1913-1928, September 2013.
M. Souden, K. Kinoshita, and T. Nakatani, "Towards online maximum likelihood speech clustering and separation," Journal of Acoust. Soc. America (JASA) Express letter, vol. 133, no. 5, pp. EL339-EL345, 2013.
H. Sawada, H. Kameoka, S. Araki, and N. Ueda, "Multichannel extensions of nonnegative matrix factorization with complex-valued data," IEEE Transactions on Audio, Speech, and Language Processing, vol. 21, no. 5, pp. 971-982, May 2013.
M. Suzuki, T. Yoshioka, S. Watanabe, N. Minematsu, and K. Hirose, "Feature enhancement with joint use of consecutive corrupted and noise feature vectors with discriminative region weighting," IEEE Transactions on Audio, Speech, and Language Processing, vol. 21, no. 10, pp. 2172-2181, October 2013.
J. Muramatsu, "Channel coding and lossy source coding using a generator of constrained random numbers," to appear in IEEE Transactions on Information Theory, 2013.
J. Muramatsu and S. Miyake, "Corrections to 'Hash property and coding theorems for sparse matrices and maximum-likelihood coding,'" IEEE Transactions on Information Theory, vol. IT-59, no. 10, pp. 6952-6953, October 2013.
J. B. Goette, S. Shinohara, and M. Hentschel, "Are Fresnel filtering and the angular Goos-Haenchen shift the same?," Journal of Optics, vol. 15, p. 014009, 2013.
S. Sunada, T. Fukushima, S. Shinohara, T. Harayama, and M. Adachi, "Stable single-wavelength emission from stadium-shaped chaotic microcavity lasers," Physical Review A 88, p. 013802, 2013.
T. Fukushima, S. Shinohara, S. Sunada, T. Harayama, K. Arai, K. Sakaguchi, and T. Tokuda, "Selective excitation of the lowest-order transverse ring modes in a quasi-stadium laser diode," Optics Letters 38, pp. 4158-4161, 2013.
H. Koizumi, S. Morikatsu, H. Aida, T. Nozawa, I. Kakesu, A. Uchida, K. Yoshimura, J. Muramatsu, and P. Davis, "Information-theoretic secure key distribution based on common random-signal induced synchronization in unidirectionally-coupled cascades of semiconductor lasers," Optics Express, vol. 21, no. 15, pp. 17869-17893, July 2013.

Books and Review Articles

堀, "意図を理解する音声認識技術," 電子情報通信学会誌, 平成２５年１１月号　小特集「携帯電話の聞く・聞かせる技術」
堀, 荒木, 中谷, 中村, "みんなの会話を聞き取るコンピュータを目指して," NTT技術ジャーナル, September 2013.
久保, 小川, 堀, 中村, "音声と言語の一体型学習に基づく音声認識技術," NTT技術ジャーナル, vol. 25, no. 9, September 2013.
堀, 荒木, 久保, 小川, 大庭, 中村, "複数人会話音声認識技術の最前線～みんなの会話を聞き取るコンピュータを目指して～," 日経エレクトロニクス, 2013.10.14号, pp. 71-81, 2013.
T. Hori, S. Araki, T. Nakatani, and A. Nakamura, "Advances in multi-speaker conversational speech recognition and understanding," NTT Technical Review, vol. 11, no. 12, December 2013.
T. Hori and A. Nakamura, "Speech recognition algorithms using weighted finite-state transducers," Morgan & Claypool Publishers, January 2013.
Y. Kubo, A. Ogawa, T. Hori, and A. Nakamura, "Speech recognition based on unified model of acoustic and language aspects of speech," NTT Technical Review, vol. 11, no. 12, December 2013.
H. Masataki, T. Asami, S. Yamahata, and M. Fujimoto, "Speech recognition technology that can adapt to changes in service and environment," NTT Technical Review, vol. 11 no. 7, July 2013.
政瀧, 浅見, 山畠, 藤本, "サービスや利用環境の変化に柔軟に対応する音声認識技術," NTT技術ジャーナル, vol. 25, no. 3, pp. 16-20, March 2013.
A. Uchida, H. Koizumi, I. Kakesu, K. Yoshimura, J. Muramatsu, and P. Davis, "Synchronized semiconductor lasers for secure key distribution," SPIE Newsroom, DOI: 10.1117/2.1201311.005200, 2013.
村松, 吉村, デイビス, 内田, 原山, "Bounded observability に基づく秘密鍵配送," 学会誌「応用数理」23, no. 1, pp. 11-20, 2013.
村松, "Slepian-Wolf の定理―相関のある情報源の分散符号化," IEICE Fundamentals Review, vol. 7, no. 3, pp. 227-241, January 2014.

International Conference Proceedings

A. Ogawa, T. Hori, and A. Nakamura, "Discriminative recognition rate estimation for n-best list and its application to n-best rescoring," ICASSP2013, pp. 6832-6836, 2013.
A. Ogawa, T. Hori, A. Nakamura, and T. Oba, "Recognition rate estimation based on error type classification and its applications," Invited Talk at Workshop Errare 2013.
M. Fujimoto and T. Nakatani, "Model-based noise suppression using unsupervised estimation of hidden Markov model for non-stationary noise," Interspeech2013, pp. 2982-2986, August 2013.
M. Delcroix, A. Ogawa, S.-J. Hahm, T. Nakatani, and A. Nakamura, "Unsupervised discriminative adaptation using differenced maximum mutual information based linear regression," ICASSP2013, pp. 7888-7892, 2013.
M. Delcroix, Y. Kubo, T. Nakatani, and A. Nakamura, "Is speech enhancement pre-processing still relevant when using deep neural networks for acoustic modeling?" Interspeech2013, pp. 2992-2996, 2013.
K. Aoyama, A. Ogawa, T. Hattori, T. Hori, and A. Nakamura, "Graph index based query-by-example search on a large speech data set," ICASSP2013, pp. 8520-8524, 2013.
Y. Kubo, T. Hori, and A. Nakamura, "A method for structure estimation of weighted finite-state transducers and its application to grapheme-to-phoneme conversion," Interspeech2013.
T. Oba, A. Ogawa, T. Hori, H. Masataki, and A. Nakamura, "Unsupervised discriminative language modeling using error rate estimator," Interspeech2013, pp. 1223-1227, 2013.
Y. Kubo, T. Hori, and A. Nakamura, "Large vocabulary continuous speech recognition based on WFST structured classifiers and deep bottleneck features," ICASSP2013, pp. 7629-7633, 2013.
S.-J. Hahm, A. Ogawa, M. Delcroix, M. Fujimoto, T. Hori, and A. Nakamura, "Feature space variational Bayesian linear regression and its combination with model space VBLR," ICASSP2013, pp. 7898-7902, 2013.
T. Yoshioka and T. Nakatani, "Noise model transfer using affine transformation with application to large vocabulary reverberant speech recognition," ICASSP2013, pp. 7058-7062, May 2013.
T. Yoshioka and T. Nakatani, "Dereverberation for reverberation-robust microphone arrays," Proc. 21st European Signal Processing Conference (EUSIPCO 2013), September 2013.
T. Nakatani, M. Souden, S. Araki, T. Yoshioka, T. Hori, and A. Ogawa, "Coupling beamforming with spatial and spectral feature based spectral enhancement and its application to meeting recognition," ICASSP2013, pp. 7249-7253, May 2013.
T. Nakatani, M. Delcroix, and M. Fujimoto, "Speech enhancement in a car using spatial and spectral models for speaker and noise," in Proc. of The 6th Biennial Workshop on Digital Signal Processing for In-Vehicle Systems, September 2013.
K. Kinoshita, M. Souden, and T. Nakatani, "Blind source separation using spatially distributed microphones based on microphone-location dependent source activities," Interspeech2013, pp. 822-826, August 2013.
K. Kinoshita, M. Delcroix, T. Yoshioka, T. Nakatani, E. Habets, R. Haeb-Umbach, V. Leutnant, A. Sehr, W. Kellermann, R. Maas, S. Gannot, and B. Raj, "The REVERB challenge: a common evaluation framework for dereverberation and recognition of reverberant speech," WASPAA, October 2013.
K. Kinoshita and T. Nakatani, "Microphone-location dependent mask estimation for BSS using spatially distributed asynchronous microphones," 2013 International Symposium on Intelligent Signal Processing and Communications Systems (ISPACS), pp. 326-331, November 2013.
N. Ito, S. Araki, and T. Nakatani, "Permutation-free convolutive blind source separation via full-band clustering based on frequency-independent source presence priors," ICASSP2013, pp. 3238-3242, May 2013.
M. Souden, K. Kinoshita, and T. Nakatani, "An integration of source location cues for speech clustering in distributed microphone arrays," ICASSP2013, pp. 111-115, May 2013.
R. Maas, W. Kellermann, A. Sehr, T. Yoshioka, M. Delcroix, K. Kinoshita, and T. Nakatani, "Formulation of the REMOS concept from an uncertainty decoding perspective," in Proc. of the International Conference on Digital Signal Processing (IEEE), pp. 1-6, July 2013.
A. Sehr, T. Yoshioka, M. Delcroix, K. Kinoshita, T. Nakatani, R. Maas, and W. Kellermann, "Conditional emission densities for interconnecting speech enhancement and recognition systems," Interspeech2013, pp. 3502-3506, 2013.
I. Jafari, N. Ito, M. Souden, S. Araki, and T. Nakatani, "Source number estimation based on clustering of speech activity sequences for microphone array processing," Proc. IEEE International Workshop on Machine Learning for Signal Processing (IEEE MLSP), September 2013.
Y. Uezu, K. Kinoshita, M. Souden, and T. Nakatani, "On the robustness of distributed EM based BSS in asynchronous distributed microphone array scenarios," Interspeech2013, pp. 3298-3302, August 2013.
N. Ono, Z. Koldovsky, S. Miyabe, and N. Ito, "The 2013 signal separation evaluation campaign," Proc. IEEE International Workshop on Machine Learning for Signal Processing (IEEE MLSP), September 2013.
J. Muramatsu, "Equivalence between inner regions for broadcast channel coding," The Proceedings of the 2013 IEEE Information Theory Workshop, pp. 164-168, 2013.
J. Muramatsu, "Channel code using a constrained random number generator," The Proceedings of the 2013 IEEE International Symposium on Information Theory, pp. 2463-2467, 2013.
J. Muramatsu, "Lossy source code using a constrained random number generator," The Proceedings of the 2013 IEEE International Symposium on Information Theory, pp. 2354-2358, 2013.
K. Yoshimura, J. Muramatsu, K. Arai, S. Shinohara, and A. Uchida, "Synchronization of semiconductor lasers by injection of common broadband random light," Proc. of the 2013 International Symposium on Nonlinear Theory and Its Applications, pp. 449-452, 2013.
K. Yoshimura, "Existence and stability of discrete breathers in Fermi-Pasta-Ulam lattices," Proc. of the 2013 International Symposium on Nonlinear Theory and Its Applications, pp. 274-277, 2013.
K. Arai, S. Shinohara, S. Sunada, K. Yoshimura, T. Harayama, and A. Uchida, "Noise effects on generalized chaos synchronization in semiconductor lasers," Proc. of the 2013 International Symposium on Nonlinear Theory and Its Applications, pp. 413-416, 2013.
S. Shinohara, T. Fukushima, S. Sunada, T. Harayama, K. Arai, and K. Yoshimura, "Nonlinear modal dynamics in two-dimensional cavity microlasers," Proc. of the 2013 International Symposium on Nonlinear Theory and Its Applications, pp. 409-412, 2013.
T. Fukushima, S. Shinohara, S. Sunada, T. Harayama, K. Sakaguchi, and Y. Tokuda, "Ray dynamical simulation of Penrose unilluminable room cavity," Frontiers in Optics 2013/Laser Science XXIX, October 2013.
I. Kakesu, H. Koizumi, S. Morikatsu, H. Aida, T. Nozawa, A. Uchida, K. Yoshimura, J. Muramatsu, and P. Davis, "Secure key distribution using common-signal-induced synchronization in cascaded semiconductor lasers," Proc. of Frontiers in Optics, 2013.
R. Takahashi, Y. Akizawa, A. Uchida, T. Harayama, K. Tsuzuki, S. Sunada, K. Yoshimura, K. Arai, and P. Davis, "Physical random number generation using photonic integrated circuit with mutually coupled semiconductor lasers," Frontiers in Optics 2013, October 8-12, 2013.

Other conference proceedings

堀, 中村, "動的言語モデルを用いるワンパス WFST デコーダ," 日本音響学会 2013年春季研究発表会講演論文集, 1-8-12, March 2013.
堀, 久保, 小川, 荒木, 中村, "会話シーン分析の複数人自由会話音声認識におけるディープラーニングの効果," 日本音響学会 2013年秋季研究発表会講演論文集, 1-8-13, September 2013.
小川, 堀, 中村, "識別的認識率推定のN ベスト仮説への拡張とN ベストリスコアリングへの応用," 日本音響学会 2013年春季研究発表会講演論文集, 2-9-4, March 2013.
藤本, 久保, 中谷, "確率モデルに基づく雑音抑圧法における雑音モデルのノンパラメトリックベイズ推定," 日本音響学会 2013年秋季研究発表会講演論文集, 1-8-4, September 2013.
藤本, 中谷, "確率モデルに基づく雑音抑圧法における雑音 HMM パラメータの教師無し推定," 日本音響学会 2013年春季研究発表会講演論文集, 1-9-1, March 2013.
M. Espi, M. Fujimoto, Y. Kubo, and T. Nakatani, "Acoustic modelling of non-speech acoustic events based on deep belief networks," 日本音響学会 2013年秋季研究発表会講演論文集, 1-8-9, September 2013.
デルクロア, 小川, ハム, 中谷, 中村, "dMMI識別基準による教師なし線形回帰音響モデル適応," 日本音響学会 2013年春季研究発表会講演論文集, pp. 181-182, March 2013.
デルクロア, 久保, 中谷, 中村, "ディープニューラルネットワークに基づく音声認識における音声強調フロントエンドの有効性の評価," 日本音響学会 2013年秋季研究発表会講演論文集, September 2013.
久保, 堀, 中村, "Deep neural network とWFST型構造識別器を用いた大語彙連続音声認識," 日本音響学会 2013年春季研究発表会講演論文集, 2-9-13, March 2013.
大庭, 小川, 堀, 政瀧, 中村, "誤り率推定器を用いた識別的言語モデルの教師なし学習," 日本音響学会 2013年秋季研究発表会講演論文集, 1-8-14, September 2013.
石川, 西田, 藤本, 山本, "音声の周期・非周期成分分解に基づく話者認識の検討," 電子情報通信学会, 音声研究会, SP2012-102, pp. 25-30, January 2013.
木下, 中谷, "音源信号の距離減衰・局在性を考慮した大規模分散マイク音源分離に関する一検討," 日本音響学会 2013年秋季研究発表会講演論文集, 2013.
T. Yoshioka and M. J. F. Gales, "An investigation of single-microphone automatic meeting transcription," presented at the 2nd UKSpeech Conference, September 2013.
伊藤, 荒木, 中谷, "時変混合重みに基づくパーミュテーション問題のないクラスタリングベース音源分離," 電子情報通信学会技術報告, vol. 113, no. 27, EA2013-2, pp. 7-12, May 2013.
伊藤, ジャファリ, 荒木, 中谷, "音源アクティビティ系列のクラスタリングに基づく高残響・劣決定下音源数推定法," 電子情報通信学会技術報告, vol. 113, no. 242, EA2013-66, pp. 17-21, October 2013.
ソウデン, 木下, 中谷, "ノード内・ノード間情報の統合に基づく分散マイクアレイ音源分離," 日本音響学会 2013年春季研究発表会講演論文集, pp. 797-798, March 2013.
小林, 大室, 木下, 中谷, "パワー領域線形回帰モデルに基づく低演算量・オンライン残響抑圧," 日本音響学会 2013年秋季研究発表会講演論文集, 2013.
鎌土, 小橋川, 木下, 政瀧, 高橋, "モバイル音声認識における主話者音声区間検出への残響除去法の応用," 日本音響学会 2013年春季研究発表会講演論文集, 2013.
村松, "Slepian-Wolf の定理," 電子情報通信学会総合大会予稿集, AT-3-2, 2013.
村松, "ブロードキャスト通信路符号化の内界について," 第36回情報理論とその応用シンポジウム予稿集, pp. 255-260, 2013.
J. Muramatsu, "Constructions of a code for wiretap channel with state by using a constrained random number generator," 第8回シャノン理論ワークショップ予稿集, pp. 55-64, 2013.
掛巣, 小泉, 森勝, 会田, 野澤, 内田, 吉村, 村松, デイビス, "カスケード接続された半導体レーザにおける共通ランダム信号入力同期を用いた秘密鍵配送実験," Proceedings of Optics & Photonics Japan, 2013.
高橋, 秋澤, 山崎, 内田, 原山, 都築, 砂田, 吉村, 新井, デイビス, "外部共振器を有するカオス発生用光集積回路による高速物理乱数生成," 信学技報, vol. 113, no. 116, pp. 93-98, 2013.
砂田, 新井, 篠原, 原山, "レーザカオスによる量子ノイズ増幅と物理乱数生成," 日本応用数理学会第９回研究部会連合発表会, March 14-15, 2013.
福嶋, 篠原, 砂田, 原山, 新井, 坂口, 徳田安紀, "擬似スタジアム型半導体レーザーにおける最低次リングモードの選択励起," 第453回研究会「レーザー計測その他」, December 3, 2013.

2012

Journal papers

T. Hori, S. Araki, T. Yoshioka, M. Fujimoto, S. Watanabe, T. Oba, A. Ogawa, K. Otsuka, D. Mikami, K. Kinoshita, T. Nakatani, A. Nakamura, and J. Yamato, "Low-latency real-time meeting recognition and understanding using distant microphones and omni-directional camera," IEEE Transactions on Audio, Speech, and Language Processing, vol. 20, no. 2, pp. 499-513, February 2012.
A. Ogawa and A. Nakamura, "Joint estimation of confidence and error causes in speech recognition," Speech Communication, vol. 54, no. 9, pp. 1014-1028, November 2012.
M. Fujimoto, S. Watanabe, and T. Nakatani, "Frame-wise model re-estimation method based on Gaussian pruning with weight normalization for noise robust voice activity detection," Speech Communication, vol. 54, no. 2, pp. 229-244, February 2012.
Y. Kubo, S. Watanabe, T. Hori, and A. Nakamura, "Structural classification methods based on weighted finite-state transducers for automatic speech recognition," IEEE Transactions on Audio, Speech, and Language Processing, vol. 20, no. 8, pp. 2240-2251, October 2012.
T. Oba, T. Hori, A. Nakamura, and A. Ito, "Round-robin duel discriminative language models," IEEE Transactions on Audio, Speech, and Language Processing, vol. 20, no. 4, pp. 1244-1255, May 2012.
T. Oba, T. Hori, A. Nakamura, and A. Ito, "Model shrinkage for discriminative language models," IEICE Transactions on Information and Systems, vol. E95-D, no. 5, pp. 1465-1474, May 2012.
T. Yoshioka and T. Nakatani, "Generalization of multi-channel linear prediction methods for blind MIMO impulse response shortening," IEEE Transactions on Audio, Speech, and Language Processing, vol. 20, no. 10, pp. 2707-2720, December 2012.
M. Souden, M. Delcroix, K. Kinoshita, T. Yoshioka, and T. Nakatani, "Noise power spectral density tracking: A maximum likelihood perspective," IEEE Signal Processing Letters, vol. 19, no. 8, pp. 495-498, August 2012.
K. Ishiguro, T. Yamada, S. Araki, T. Nakatani, and H. Sawada, "Probabilistic speaker diarization with bag-of-words representations of speaker angle information," IEEE Transactions on Audio, Speech, and Language Processing, vol. 20, no. 2, pp. 447-460, 2012.
E. Vincent, S. Araki, F. Theis, G. Nolte, P. Bofill, H. Sawada, A. Ozerov, V. Gowreesunker, D. Lutter, and N. Q. K. Duong, "The signal separation evaluation campaign (2007-2010): Achievements and remaining challenges," Signal Processing, vol. 92, no. 8, pp. 1928-1936, August 2012.
J. Muramatsu and S. Miyake, "Construction of codes for wiretap channel and secret key agreement from correlated source outputs based on hash property," IEEE Transactions on Information Theory, vol. IT-58, no. 2, pp. 671-692, February 2012.
J. Muramatsu and S. Miyake, "Corrections to 'Hash property and fixed-rate universal coding theorems'," IEEE Transactions on Information Theory, vol. IT-58, no. 5, pp. 3305-3307, May 2012.
K. Yoshimura, J. Muramatsu, P. Davis, T. Harayama, H. Okumura, S. Morikatsu, H. Aida, and A. Uchida, "Secure key distribution using correlated randomness in lasers driven by common random light," Physical Review Letters, vol. 108, 070602, February 2012.
K. Yoshimura, "Stability of discrete breathers in diatomic nonlinear oscillator chains," Nonlinear Theory and Its Applications, IEICE, vol. 3, pp. 52-66, 2012.
K. Yoshimura, "Stability of discrete breathers in nonlinear Klein-Gordon type lattices with pure anharmonic couplings," Journal of Mathematical Physics 53, 102701, 2012.
K. Arai, S. Sunada, T. Harayama, and P. Davis, "The randomness in Galton board from viewpoint of predictability: sensitivity and statistical bias of output states," Physical Review E 86, 056216, 2012.
H. Aida, M. Arahata, H. Okumura, H. Koizumi, A. Uchida, K. Yoshimura, J. Muramatsu, and P. Davis, "Experiment on synchronization of semiconductor lasers by common injection of constant-amplitude random-phase light," Optics Express, vol. 20, no. 11, pp. 11813-11829, May 2012.
T. Harayama, S. Sunada, K. Yoshimura, J. Muramatsu, K. Arai, A. Uchida, and P. Davis, "Theory of fast non-deterministic physical random bit generation with chaotic lasers," Physical Review E, vol. 85, 046215, April 2012.
S. Sunada, T. Harayama, P. Davis, K. Tsuzuki, K. Arai, K. Yoshimura, and A. Uchida, "Noise amplification by chaotic dynamics in a delayed feedback laser system and its application to nondeterministic random bit generation," Chaos 22, 047513, 2012.
Y. Akizawa, T. Yamazaki, A. Uchida, T. Harayama, S. Sunada, K. Arai, K. Yoshimura, and P. Davis, "Fast random number generation with bandwidth-enhanced chaotic semiconductor lasers at 8×40 Gb/s," IEEE Photonics Technology Letters 24, pp. 1042-1044, 2012.
T. Mikami, K. Kanno, K. Aoyama, A. Uchida, T. Ikeguchi, T. Harayama, S. Sunada, K. Arai, K. Yoshimura, and P. Davis, "Estimation of entropy rate in a fast physical random-bit generator using a chaotic semiconductor laser with intrinsic noise," Physical Review E 85, 016211, January 2012.
S. Sunada, T. Harayama, P. Davis, K. Tsuzuki, K. Arai, K. Yoshimura, and A. Uchida, "Noise amplification in high dimensional chaotic laser systems and its application to nondeterministic physical random bit generation," Chaos: Interdisciplinary Journal of Nonlinear Science vol. 22, 047513, 2012.
T. Hirayama, S. Arakawa, K. Arai, and M. Murata, "Dynamics of feedback-induced packet delay in ISP router-level topologies," IEICE Transactions on Communications, vol. E95-B, no. 9, pp. 2785-2793, 2012.
J.-W. Ryu, J. Cho, C.-M. Kim, S. Shinohara, and S. W. Kim, "Terahertz beat frequency generation from two-mode lasing operation of coupled microdisk laser," Optics Letters 37, pp. 3210-3213, 2012.

Books and review articles

藤本, "音声区間検出の基礎と世界的な研究動向, 今後の展開," 電子情報通信学会誌, vol. 95, no. 8, pp. 754-758, August 2012.
T. Yoshioka, A. Sehr, M. Delcroix, K. Kinoshita, R. Maas, T. Nakatani, and W. Kellermann, "Making machines understand us in reverberant rooms: robustness against reverberation for automatic speech recognition," IEEE Signal Processing Magazine, vol. 29, no. 6, pp. 114-126, November 2012.
吉岡, 中谷, "確率モデルを用いた音声強調 ‐雑音抑圧, 音源分離, 残響除去, 統合技術及びその応用‐," 日本音響学会誌, vol. 68, no. 11, pp. 572-577, November 2012 (invited paper).
吉村, 篠原, 新井, "半導体レーザカオスを利用した高速物理乱数生成," NTT技術ジャーナル, vol. 24, no. 9, pp. 19-22, 2012.
K. Yoshimura, S. Shinohara, and K. Arai, "Fast Physical Random Number Generation Using Semiconductor Laser Chaos," NTT Technical Review, vol. 10, no. 11, 2012.

International conference proceedings

T. Hori, K. Kinoshita, S. Araki, A. Ogawa, T. Yoshioka, M. Fujimoto, T. Oba, M. Delcroix, M. Souden, Y. Kubo, S.-J. Hahm, D. Mikami, K. Otsuka, T. Nakatani, A. Nakamura, and J. Yamato, "Real-time audio-visual meeting recognition and understanding using distant microphone array," ICASSP2012, Show & Tell session.
A. Ogawa, T. Hori and A. Nakamura, "Recognition rate estimation based on word alignment network and discriminative error type classification," SLT, 2012.
A. Ogawa, T. Hori, and A. Nakamura, "Error type classification and word accuracy estimation using alignment information in word confusion network," ICASSP2012, pp. 4925-4928, March 2012.
M. Fujimoto and T. Nakatani, "A reliable data selection for model-based noise suppression using unsupervised joint speaker adaptation and noise model estimation," ICSPCC 2012, pp. 4713-4716, Aug 2012. (invited talk)
M. Fujimoto, S. Watanabe, and T. Nakatani, "Noise suppression with unsupervised joint speaker adaptation and noise mixture model estimation," ICASSP2012, pp. 4713-4716, March 2012.
M. Espi, M. Fujimoto, D. Saito, N. Ono, and S. Sagayama, "A tandem connectionist model using combination of multi-scale spectro-temporal features for acoustic event detection," ICASSP2012, pp. 4293-4296, March 2012.
M. Delcroix, A. Ogawa, T. Nakatani, and A. Nakamura, "Dynamic variance adaptation using differenced maximum mutual information," MLSLP, 2012.
M. Delcroix, A. Ogawa, S. Watanabe, T. Nakatani, and A. Nakamura, "Discriminative feature transforms using differenced maximum mutual information," ICASSP2012, pp. 4753-4756, March 2012.
Y. Kubo, T. Hori, and A. Nakamura, "Integrating deep neural networks into structured classification approach based on weighted finite-state transducers," Interspeech2012, September 2012.
T. Oba, T. Hori, A. Nakamura, and A. Ito, "Spoken document retrieval by discriminative modeling in a high dimensional feature space," ICASSP2012, pp. 5153-5156, March 2012.
S.-J. Hahm, A. Ogawa, M. Fujimoto, T. Hori, and A. Nakamura, "Speaker adaptation using variational Bayesian linear regression in normalized feature space," Interspeech2012, September 2012.
S.-J. Hahm, S. Watanabe, M. Fujimoto, T. Hori, and A. Nakamura, "Normalization and adaptation by consistently employing MAP estimation," IWSML 2012.
S. Watanabe, Y. Kubo, T. Oba, T. Hori, and A. Nakamura, "Bag of arcs: new representation of speech segment features based on finite state machines," ICASSP2012, pp. 4201-4204, March 2012.
S. Kobashikawa, T. Hori, Y. Yamaguchi, T. Asami, H. Masataki, and S. Takahashi, "Efficient beam width control to suppress excessive speech recognition computation time based on prior score range normalization," Interspeech2012, September 2012.
S. Kobashikawa, T. Hori, Y. Yamaguchi, T. Asami, H. Masataki, and S. Takahashi, "Efficient prior and incremental beam width control to suppress excessive speech recognition time based on score range estimation," SLT, 2012.
S. Yamahata, Y. Yamaguchi, A. Ogawa, H. Masataki, O. Yoshioka, and S. Takahashi, "Automatic vocabulary adaptation based on semantic similarity and speech recognition confidence measure," Interspeech2012, September 2012.
E. Chuangsuwanich, S. Watanabe, T. Hori, T. Iwata, and J. Glass, "Handling uncertain observations in unsupervised topic-mixture language model adaptation," ICASSP2012, pp. 5033-5036, March 2012.
M. Suzuki, T. Yoshioka, S. Watanabe, N. Minematsu, and K. Hirose, "MFCC enhancement using joint corrupted and noise feature space for highly non-stationary noise environments," ICASSP2012, pp. 4109-4112, March 2012.
R. Roller, S. Watanabe and T. Iwata, "Effect of dialog acts on word use in polylogue," ICASSP2012, pp. 4969-4972, March 2012.
T. Nakatani, T. Yoshioka, S. Araki, M. Delcroix, and M. Fujimoto, "Logmax observation model with MFCC-based spectral prior for reduction of highly nonstationary ambient noise," ICASSP2012, pp. 4029-4032, March 2012.
S. Araki and T. Nakatani, "Sparse vector factorization for underdetermined BSS using wrapped-phase GMM and source log-spectral prior," ICASSP2012, pp. 265-268, March 2012.
S. Araki, F. Nesta, E. Vincent, Z. Koldovsky, G. Nolte, A. Ziehe, and A. Benichoux, "SiSEC2011 overview: Audio source separation," in Proc. LVA/ICA2012, pp. 414-422, March 2012.
K. Kinoshita, M. Delcroix, M. Souden, and T. Nakatani, "Example-based speech enhancement with joint utilization of spatial, spectral & temporal cues of speech and noise," Interspeech2012.
T. Yoshioka and T. Nakatani, "Time-varying residual noise feature model estimation for multi-microphone speech recognition," ICASSP2012, pp. 4913-4916, March 2012.
T. Yoshioka and D. Sakaue, "Log-normal matrix factorization with application to speech-music separation," SAPA-SCALE 2012, pp. 80-85, September 2012.
T. Yoshioka, A. Sehr, M. Delcroix, K. Kinoshita, R. Maas, T. Nakatani, and W. Kellermann, "Survey on approaches to speech recognition in reverberant environments," APSIPA, 2012. (Invited paper)
M. Souden, K. Kinoshita, M. Delcroix, and T. Nakatani, "Distributed microphone array processing for speech source separation with classifier fusion," MLSP, September 2012.
M. Souden, S. Araki, K. Kinoshita, T. Nakatani, and H. Sawada, "A multichannel MMSE-based framework for joint blind source separation and noise reduction," ICASSP2012, pp. 109-112, March 2012.
T. Maruyama, S. Araki, T. Nakatani, S. Miyabe, T. Yamada, S. Makino, and A. Nakamura, "New analytical update rule for TDOA inference for underdetermined BSS in noisy environments," ICASSP2012, pp. 269-272, March 2012.
Y. Iwata and T. Nakatani, "Introduction of speech log-spectral priors into dereverberation based on Itakura-Saito distance minimization," ICASSP2012, pp. 245-248, March 2012.
H. Sawada, H. Kameoka, S. Araki, and N. Ueda, "Efficient algorithms for multichannel extensions of Itakura-Saito nonnegative matrix factorization," ICASSP2012, pp. 261-264, March 2012.
G. Nolte, D. Lutter, A. Ziehe, F. Nesta, E. Vincent, Z. Koldovsky, A. Benichoux, and S. Araki, "SiSEC2011 overview: biomedical data analysis," in Proc. LVA/ICA2012, pp. 423-429, March 2012.
T. Maruyama, S. Araki, T. Nakatani, S. Miyabe, T. Yamada, S. Makino, and A. Nakamura, "New analytical calculation and estimation for TDOA inference for underdetermined BSS in noisy environments," APSIPA, 2012.
J. Muramatsu, "Information theoretic security based on bounded observability," DIMACS Workshop on Information-Theoretic Network Security, November 2012.
J. Muramatsu and S. Miyake, "Uniform random number generation by using sparse matrix," Proceedings of the 2012 IEEE Information Theory Workshop, pp. 612-616, 2012.
K. Yoshimura, J. Muramatsu, P. Davis, A. Uchida, and T. Harayama, "Secure key distribution using correlated randomness in optical devices," Proceedings of the 2012 International Symposium on Nonlinear Theory and Its Applications, pp. 336-339, 2012.
K. Yoshimura, "Existence and stability of localized modes in one-dimensional nonlinear lattices," The 19th International Symposium on Nonlinear Acoustics, AIP Conference Proceedings 1474, pp. 59-62, 2012.
K. Yoshimura, "Stability of discrete breathers in nonlinear Klein-Gordon type lattices," Proc. of the 2012 International Symposium on Nonlinear Theory and Its Applications, pp. 403-406, 2012.
K. Arai, T. Harayama, P. Davis, J. Muramatsu, and S. Sunada, "Multi-bit sampling from chaotic time series in random number generation," Proceedings of the 2012 International Symposium on Nonlinear Theory and Its Applications, pp. 268-271, 2012.
S. Sunada, T. Harayama, P. Davis, K. Arai, K. Yoshimura, K. Tsuzuki, M. Adachi, and A. Uchida, "Noise amplification based on dynamical instabilities in semiconductor laser systems and its application to nondeterministic random bit generators," Proc. of the 2012 International Symposium on Nonlinear Theory and Its Applications, pp. 263-267, 2012.
S. Miyake and J. Muramatsu, "Universal codes on continuous alphabet using sparse matrices," Proceedings of the 2012 International Symposium on Information Theory and its Applications, pp. 493-497, 2012.
S. Miyake and J. Muramatsu, "On a construction of universal network code using LDPC matrices," The Proceedings of the 2012 IEEE International Symposium on Information Theory, pp. 1306-1310, 2012.
H. Koizumi, S. Morikatsu, H. Aida, M. Arahata, T. Nozawa, A. Uchida, K. Yoshimura, J. Muramatsu, and P. Davis, "Experiment on secure key distribution using correlated random phenomenon in semiconductor lasers," Proceedings of the 2012 International Symposium on Nonlinear Theory and Its Applications, pp. 340-343, 2012.
T. Yamazaki, Y. Akizawa, A. Uchida, K. Yoshimura, K. Arai, and P. Davis, "Fast random number generation with bandwidth-enhanced chaos and post-processing," Proc. of the 2012 International Symposium on Nonlinear Theory and Its Applications, pp. 142-145, 2012.
R. Takahashi, Y. Akizawa, T. Yamazaki, A. Uchida, T. Harayama, K. Tsuzuki, S. Sunada, K. Yoshimura, K. Arai, and P. Davis, "Random number generation with a photonic integrated circuit for fast chaos generation," Proc. of the 2012 International Symposium on Nonlinear Theory and Its Applications, pp. 138-141, 2012.
Y. Akizawa, R. Takahashi, H. Aida, T. Yamazaki, A. Uchida, T. Harayama, K. Tsuzuki, S. Sunada, K. Yoshimura, K. Arai, and P. Davis, "Nonlinear dynamics in a photonic integrated circuit for fast chaos generation," Proc. of the 2012 International Symposium on Nonlinear Theory and Its Applications, pp. 134-137, 2012.
T. Hirayama, S. Arakawa, K. Arai, and M. Murata, "On the power-law characteristic of link capacity distribution in ISP router-level topologies," 21st International Conference on Computer Communications and Networks (ICCCN12), July 30 - August 2, 2012.

Other conference proceedings

堀, 荒木, 小川, ソウデン, デルクロア, 吉岡, 大庭, 藤本, 木下, 久保, 咸, 渡部, 中谷, 中村, "会話分析タスクにおける複数人自由会話の遠隔発話音声認識の評価," 日本音響学会 2012年春季研究発表会講演論文集, 3-P-5, pp. 223-224, March 2012.
堀, 小川, 藤本, 大庭, 久保, 咸, 荒木, ソウデン, デルクロア, 吉岡, 木下, 中谷, 中村, "会話分析タスクにおける複数人自由会話音声認識の改善," 日本音響学会 2012年秋季研究発表会講演論文集, 1-1-19, pp. 55-56, September 2012.
堀, 荒木, 大塚, 中谷, 中村, 大和, "複数人会話シーン分析の研究と今後の展望," 電子情報通信学会技術報告, vol. 112, no. 141, SP2012-52, pp. 13-18, July 2012 (invited talk).
小川, 堀, 中村, "単語コンフュージョンネットワークから得られるアライメント特徴量を用いた誤りタイプ分類と認識精度推定," 日本音響学会 2012年春季研究発表会講演論文集, 3-P-3, March 2012.
小川, 堀, 中村, "単語アライメントネットワークと識別的誤りタイプ分類による認識精度推定," 日本音響学会 2012年秋季研究発表会講演論文集, 2-1-5, September 2012.
藤本, 中谷, "話者適応と雑音混合モデル推定の同時適用を用いた雑音抑圧法への高信頼データ選択の導入," 日本音響学会 2012年秋季研究発表会講演論文集, 1-1-6, September 2012.
デルクロア, 小川, 渡部, 中谷, 中村, "dMMI基準による特徴量変換の識別学習," 日本音響学会 2012年春季研究発表会講演論文集, pp. 121-122, March 2012.
デルクロア, 小川, 中谷, 中村, "dMMI識別基準による教師なし動的分散適応," 日本音響学会 2012年秋季研究発表会講演論文集, pp. 131-132, September 2012.
咸, 渡部, 藤本, 小川, 堀, 中村, "事前分布共有に基づく特徴空間と音響モデルの同時適応," 日本音響学会 2012年春季研究発表会講演論文集, 2-7-5, March 2012.
S.-J. Hahm, A. Ogawa, M. Fujimoto, T. Hori, and A. Nakamura, "Feature space variational Bayesian linear regression," 日本音響学会 2012年秋季研究発表会講演論文集, 3-P-11, September 2012.
M. Souden, K. Kinoshita, M. Delcroix, and T. Nakatani, "A multichannel MMSE-based approach for speech source separation and noise reduction," 日本音響学会 2012年春季研究発表会講演論文集, pp. 857-858, 2012.
大庭, 堀, 中村, 伊藤, "高次元識別モデルによる音声ドキュメント検索," 日本音響学会 2012年春季研究発表会講演論文集, 3-7-9, March 2012.
小橋川, 堀, 山口, 浅見, 政瀧, "音声認識処理時間の安定化のための事前スコアレンジ正規化に基づくビーム幅制御," 日本音響学会 2012年春季研究発表会講演論文集, 2-7-7, March 2012.
中谷, 吉岡, 荒木, デルクロア, 藤本, "音声と雑音のメル周波数ケプストラム係数GMM に基づくモノラル／マルチチャンネル雑音抑圧," 日本音響学会 2012年春季研究発表会講演論文集, 1-Q-14, pp. 853-856, March 2012.
木下, 吉岡, 中谷, "音声・音楽信号の残響除去・制御技術とその応用～音声をより聞き取りやすく、音楽をより豊かに～," 電子情報通信学会技術研究報告, vol. 112, no. 266, EA2012-80, pp. 91-96, October 2012.
久保, 渡部, 堀, 中村, "重み付き有限状態トランスデューサに基づく構造識別モデルの最小状態遷移エラー学習," 日本音響学会 2012年春季研究発表会講演論文集, 1-P-20, March 2012.
岩田, 中谷, 藤本, 吉岡, 齋藤, "対数スペクトル事前分布を用いた音声の MAP スペクトル推定と雑音抑圧による評価," 日本音響学会 2012年秋季研究発表会講演論文集, 1-P-30, pp. 795-798, September 2012.
岩田, 中谷, 藤本, 吉岡, 齋藤, "対数スペクトル事前分布を用いたMAP スペクトル推定に基づく劣決定音源分離," 電子情報通信学会技術研究報告, vol. 112, no. 347, EA2012-114, pp. 29-34, December 2012.
M. Espi, M. Fujimoto, D. Saito, N. Ono, and S. Sagayama, "Acoustic event detection using tandem connectionist model based integration of multi-scale spectral features," 日本音響学会 2012年春季研究発表会講演論文集, 1-P-11, March 2012.
M. Souden, M. Delcroix, K. Kinoshita, T. Yoshioka, and T. Nakatani, "A new recursive approach for noise power spectral density tracking," 日本音響学会 2012年秋季研究発表会講演論文集, pp. 741-742, September 2012.
礒, 荒木, 牧野, 中谷, 澤田, 山田, 宮部, 中村, "フルランク空間相関行列モデルに基づく拡散性雑音除去," 電子情報通信学会総合大会, A-10-9, March 2012.
武田, 亀岡, 澤田, 荒木, "混合DOA モデルに基づく多チャンネル複素NMF による劣決定BSS," 日本音響学会 2012年春季研究発表会講演論文集, 2-1-9, 2012.
丹羽, 日岡, 荒木, 古家, 羽田, "最大SN比法への拡散センシングの適用," 日本音響学会 2012年秋季研究発表会講演論文集, 2012.
澤田, 亀岡, 上田, 荒木, "非負値行列因子分解NMFの多チャンネル拡張," 電子情報通信学会信号処理研究専門委員会
ドゥックズイ, 吉岡, 峯松, 広瀬, "特徴量強調における教師なし話者適応に関する検討," 情報処理学会音声言語情報処理研究会研究報告, vol. 2012-SLP-94, no. 23, December 2012.
鈴木, 吉岡, 渡部, 峯松, 広瀬, "クリーン音声状態の識別に基づく特徴量強調," 日本音響学会 2012年春季研究発表会講演論文集, 1-7-10, pp. 23-26, March 2012.
阪上, 吉岡, 奥乃, "占有的基底インデクスを用いた対数正規行列分解に基づく音声と音楽の分離," 日本音響学会 2012年春季研究発表会講演論文集, 1-1-21, pp. 721-724, March 2012.
堀, 荒木, 大塚, 中谷, 中村, "複数人会話シーン分析の研究と今後の展望," 電子情報通信学会技術報告, vol. 112, no. 141, SP2012-52, pp. 13-18, July 2012 (invited talk).
秋葉, 岩野, 緒方, 小川, 小野, 篠崎, 篠田, 南條, 西崎, 西田, 西村, 原, 堀, "クラウド時代の新しい音声研究パラダイム," 情報処理学会研究報告, SLP, July 2012.
J. Muramatsu, "Constructions of code for general channel and lossy code for general source," 第35回情報理論とその応用シンポジウム予稿集, pp. 184-189, 2012.
J. Muramatsu, "Alternative general formulas for channel capacity and rate-distortion region," 第35回情報理論とその応用シンポジウム予稿集, pp. 190-194, 2012.
吉村, "非線形Klein-Gordon 型格子におけるdiscrete breather の安定性," 数理解析研究所講究録, 第1800巻, pp. 72-78, 2012.
新井, 原山, 砂田, デイビス, "ガルトンボードの予測不可能性について," 電子情報通信学会非線形問題研究会, July 5-6, 2012.
新井, 原山, 砂田, デイビス, "ガルトンボードの最終状態の予測不可能性について," 日本物理学会第67回年次大会, March 24-27, 2012.
土井, 中谷, 吉村, "離散ブリーザーの移動性に着目した対称格子の構成法," 日本物理学会第67回年次大会, 関西学院大学, 日本物理学会講演概要集第67巻第1号第2分冊, p. 298, 2012.
秋澤, 高橋, 会田, 山崎, 内田, 原山, 都築, 砂田, 吉村, 新井, デイビス, "高速カオス発生用光集積回路の非線形ダイナミクス," 2012年秋季第73回応用物理学会学術講演会, September 11-14, 2012.
高橋, 秋澤, 山崎, 内田, 原山, 都築, 砂田, 吉村, 新井, デイビス, "高速カオス発生用光集積回路を用いた物理乱数生成," 2012年秋季第73回応用物理学会学術講演会, September 11-14, 2012.

2011

Journal papers

T. Yoshioka, T. Nakatani, M. Miyoshi, and H. G. Okuno, “New method for blind separation and dereverberation of highly reverberant mixtures,” IEEE Transactions on Audio, Speech, and Language Processing, vol. 19, no. 1, pp. 69-84, January 2011.
A. Ogawa, S. Takahashi, and A. Nakamura, “Efficient combination of likelihood recycling and batch calculation for fast acoustic likelihood calculation,” IEICE Transactions on Information and Systems, vol. E94-D, no. 3, March 2011.
S. Araki, T. Nakatani, and H. Sawada, “Sparse source separation based on simultaneous clustering of source locational and spectral features,” Acoustical Science and Technology, Acoustic Letter, (in press), 2011.
T. Harayama, S. Sunada, K. Yoshimura, K. Tsuzuki, P. Davis, and A. Uchida, “Fast non-deterministic random bit generation with on-chip chaos lasers,” Physical Review A, vol. 83, 031803(R), 2011.
S. Sunada, T. Harayama, K. Arai, K. Yoshimura, P. Davis, K. Tsuzuki, and A. Uchida, “Chaos laser chips with delayed optical feedback using a passive ring waveguide,” Optics Express, vol. 19, pp. 5713-5724, 2011.
S. Sunada, T. Harayama, K. Arai, K. Yoshimura, K. Tsuzuki, A. Uchida, and P. Davis, “Random optical pulse generation with bistable semiconductor ring lasers,” Optics Express, vol. 19, pp. 7439-7450, 2011.
K. Yoshimura, “Existence and stability of discrete breathers in diatomic Fermi-Pasta-Ulam type lattices,” Nonlinearity 24, 293-317, 2011.
S. Watanabe, T. Iwata, T. Hori, A. Sako, and Y. Ariki, “Topic Tracking Language Model for Speech,” Computer Speech and Language, vol. 25, issue 2, pp. 440-461, 2011.
吉村和之, 内田淳史, Peter Davis, 村松純, 原山卓久, 砂田哲, “共通ランダム位相変調光による半導体レーザの同期,” レーザー研究 (accepted for publication).
渡辺秀行, 谷口真一, 片桐滋, 山田幸太, 中村篤, マクダーモットエリック, 渡部晋治, 大崎美穂, “逐次増加型最小分類誤り学習によるパターン認識,” 電子情報通信学会論文誌

Books and review articles

M. Fujimoto, “Chapter 1: Integration of statistical model-based voice activity detection and noise suppression for noise robust speech recognition,” in “Advances in Robust Speech Recognition Technology,” Bentham Publishing Services, March 2011.
木下慶介, 吉岡拓也, 中谷智広, “音声のブラインド残響除去：最新の研究動向,” to appear in 電子情報通信学会 Fundamentals Review, April 2011.
原山卓久、砂田哲、都築健 “レーザカオス光集積回路と高速物理乱数生成,” レーザ研究　2011年　掲載予定

国際会議予稿

M. Delcroix, K. Kinoshita, T. Nakatani, S. Araki, A. Ogawa, T. Hori, S. Watanabe, M. Fujimoto, T. Yoshioka, T. Oba, Y. Kubo, M. Souden, S.-J. Hahm and A. Nakamura, “Speech Recognition in the Presence of Highly Non-Stationary Noise Based on Spatial, Spectral and Temporal Speech/Noise Modeling Combined with Dynamic Variance Adaptation,” in Proc. CHiME Workshop, pp. 12-17, 2011.
S. Sunada, T. Harayama, K. Arai, K. Yoshimura, K. Tsuzuki, A. Uchida, and P. Davis, “Theory and experiment of fast non-deterministic random bit generation with on-chip chaos lasers,” Dynamics Days 2011, pp. 31-32, January 2011.
S. Araki, T. Hori, T. Yoshioka, M. Fujimoto, S. Watanabe, T. Oba, A. Ogawa, K. Otsuka, D. Mikami, M. Delcroix, K. Kinoshita, T. Nakatani, A. Nakamura, J. Yamato, “Low-latency meeting recognition and understanding using distant microphones,” to appear in Proceedings of the 3rd Joint Workshop on Hands-free Speech Communication and Microphone Arrays (HSCMA 2011), May 2011, presented in the Demo Session.
M. Fujimoto, S. Watanabe, and T. Nakatani, “Non-stationary noise estimation method based on bias-residual component decomposition for robust speech recognition,” Proc. of ICASSP '11, May 2011. (accepted)
A. Ogawa, S. Takahashi, and A. Nakamura, “Machine and acoustical condition dependency analyses for fast acoustic likelihood calculation techniques,” Proc. ICASSP, May 2011, to appear.
T. Yoshioka, and T. Nakatani, “A microphone array system integrating beamforming, feature enhancement, and spectral mask-based noise estimation,” to appear in Proceedings of the Third Joint Workshop on Hands-free Speech Communication and Microphone Arrays (HSCMA 2011), May 2011.
T. Yoshioka, and T. Nakatani, “Speech enhancement based on log spectral envelope model and harmonicity-derived spectral mask, and its coupling with feature compensation,” to appear in Proceedings of the 2011 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2011), May 2011.
N. Yasuraoka, H. Kameoka, T. Yoshioka, and H. G. Okuno, “I-divergence-based dereverberation method with auxiliary function approach,” to appear in Proceedings of the 2011 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2011), May 2011.
T. Nakatani, S. Araki, T. Yoshioka, and M. Fujimoto, “Joint unsupervised learning of hidden Markov source models and source location models for multichannel source separation,” to appear in Proceedings of the 2011 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2011), May 2011.
Y. Kubo, S. Wiesler, R. Schlueter, H. Ney, S. Watanabe, A. Nakamura, T. Kobayashi, “Subspace Pursuit Method for Kernel-Log-Linear Models,” Proc. ICASSP 2011, Prague, Czech Republic, May 2011.
S. Araki, T. Hori, T. Yoshioka, M. Fujimoto, S. Watanabe, T. Oba, A. Ogawa, K. Otsuka, D. Mikami, M. Delcroix, K. Kinoshita, T. Nakatani, A. Nakamura, and J. Yamato, “Demonstration on low-latency meeting recognition and understanding using distant microphones,” HSCMA2011, (accepted).
M. Delcroix, S. Watanabe, T. Nakatani, and A. Nakamura, “Discriminative approach to dynamic variance adaptation for noisy speech recognition,” Workshop on Hands-free Speech Communication and Microphone Arrays (HSCMA), 2011 (to appear).
S. Araki and T. Nakatani, “Hybrid Approach for Multichannel Source Separation Combining Time-frequency Mask with Multi-channel Wiener Filter,” ICASSP2011, (accepted).
H. Sawada, H. Kameoka, S. Araki, and N. Ueda, “FORMULATIONS AND ALGORITHMS FOR MULTICHANNEL COMPLEX NMF,” ICASSP 2011, (accepted)
K. Iso, S. Araki, S. Makino, T. Nakatani, H. Sawada, T. Yamada, and A. Nakamura, “BLIND SOURCE SEPARATION OF MIXED SPEECH IN A HIGH REVERBERATION ENVIRONMENT,” HSCMA2011, (accepted)
T. Oba, T. Hori, A. Ito, and A. Nakamura, “Round-Robin Duel Discriminative Language Models in One-pass Decoding with On-the-fly Error Correction,” Proceedings of ICASSP, 2011.
S. Watanabe, D. Mochihashi, T. Hori, and A. Nakamura, “Gibbs Sampling Based Multi-Scale Mixture Model for Speaker Clustering,” Proc. ICASSP'11.
D. Saito, S. Watanabe, A. Nakamura, and N. Minematsu, “High Accurate Model-Integration-Based Voice Conversion Using Dynamic Features and Model Structure Optimization,” Proc. ICASSP'11.
T. Maekawa and S. Watanabe, “Modeling Activities with User's Physical Characteristics Data,” Proc. ISWC'11.

その他会議予稿

木下慶介, “残響制御技術Revtrinaを用いた新しいサラウンド化再生方式の提案,” JAS (Japan Audio Society) Journal (日本オーディオ協会ジャーナル誌) 1月号, 2011.
吉岡拓也, 中谷智広, “雑音低減・耐雑音音声認識のためのスペクトル強調と特徴量補正の統合的アプローチ,” 電子情報通信学会技術研究報告, vol. 110, no. 401, SP2010-107, pp. 25-30, January 2011.
吉岡拓也, 中谷智広, “スペクトル・特徴量ドメインの統合による非定常雑音の高解像度推定,” 日本音響学会 2011年春季研究発表会講演論文集, 2-9-7, pp. 687-690, March 2011.
中谷智広, 荒木章子, 吉岡拓也, 藤本雅清, “音源スペクトルHMM と音源方向モデルの教師無し同時学習に基づく多チャンネル音源分離,” 日本音響学会 2011年春季研究発表会講演論文集, 1-Q-21, pp. 805-808, March 2011.
鈴木雅之, 吉岡拓也, 渡部晋治, 峯松信明, “雑音特徴量を用いた劣化音声特徴量変換に関する検討,” 日本音響学会 2011年春季研究発表会講演論文集, 1-Q-15, pp. 795-798, March 2011.
安良岡直希, 亀岡弘和, 吉岡拓也, 奥乃博, “補助関数法を用いたIダイバージェンス規準残響抑圧,” 日本音響学会 2011年春季研究発表会講演論文集, 3-1-9, pp. 1049-1052, March 2011.
武田, 亀岡, 澤田, 荒木, 山田, 牧野, “音源のW-DO性を仮定した多チャンネル複素NMF による劣決定BSS,” 日本音響学会2011年春季研究発表会, 1-Q-19, pp. 801-804, March 2011.
小橋川哲, 浅見太一, 山口義和, 阪内澄宇, 小川厚徳, 政瀧浩和, 高橋敏, 河原達也, “衆議院会議録作成における音声認識システム -事前音響処理-,” 音講論集, 3-5-9, March 2011.
堀貴明, 中村篤, 山口義和, 小橋川哲, 浅見太一, 政瀧浩和, 高橋敏, 河原達也, “衆議院会議録作成における音声認識システム－探索技術－,” 日本音響学会講演論文集, 3-5-8, March 2011.
渡部晋治, 持橋大地, 堀貴明, 中村篤, “Gibbs サンプリングに基づく多重混合ガウス分布モデルの提案と話者クラスタリングへの適用” 音響学会講演論文集, 1-5-6, March 2011.
俵直弘, 渡部晋治, 小川哲司, 小林哲則, “発話を単位としたディリクレ過程混合モデルに基づく話者クラスタリング” 音響学会講演論文集, 1-Q-15(f), March 2011.
久保, 渡部, 中村, ウィスラー, シュルーター, ナイ, 小林, “カーネルマシンを内包する音響モデルの高速化に向けた部分空間追跡法,” 日本音響学会春季全国大会, 1-5-11, 2011.
荒木, 中谷, “時間周波数マスクと多chウィーナフィルタによるハイブリッド音源分離アプローチ,” 日本音響学会2011年春季研究発表会, 2011.
礒, 荒木, 牧野, 中谷, 澤田, 山田, 中村, “高残響下で混合された音声の音源分離に関する研究,” 日本音響学会2011年春季研究発表会, 2011.
砂田哲 “リングレーザチップを用いたランダム光パルス生成の理論と実験” 応用数理学会2011年研究部会連合発表会, 応用カオス研究会　(電気情報通信大学)
砂田哲、原山卓久、新井賢一、吉村和之、ピーターデイビス、内田淳史 “レーザカオスによる高速・非決定論的物理乱数生成に関する理論的考察” 2011年電子情報通信学会総合大会, A-2-24,（東京都市大学）
砂田哲, 原山卓久, 新井賢一, 吉村和之, ピーターデイビス, 都築健, 内田淳史 “モノリシック集積化半導体カオスレーザチップを用いた高速物理乱数生成” 第５８回応用物理学会学術連合会　27a-BW-7　(神奈川工科大学)．
三上拓也, 菅野円隆, 青山幸太, 内田淳史, 原山卓久, 砂田　哲, 新井賢一, 吉村和之, Peter Davis, “ノイズを含む半導体レーザカオスを用いた物理乱数生成器のエントロピー生成率,” 2011年電子情報通信学会　総合大会　A-2-24 (東京都市大学)
奥村　悠, 染谷弘行, 内田淳史, 吉村和之, ピーターデイビス “半導体レーザにおけるランダム位相変調光を用いた共通信号入力同期実験” 第５８回応用物理学会学術連合会　27a-BW-8　(神奈川工科大学)．
秋澤康裕, 山崎泰基, 内田淳史, 原山卓久, 砂田　哲, 新井賢一, 吉村和之, ピーターデイビス “帯域拡大された半導体レーザカオスを用いた物理乱数生成の後処理方式,” 第５８回応用物理学会学術連合会　27a-BW-9　(神奈川工科大学)．
山崎泰基, 秋澤康裕, 森勝進一朗, 奥村　悠, 会田裕貴, 内田淳史, 原山卓久, 砂田　哲, 新井賢一, 吉村和之, ピーターデイビス “スーパールミネッセントダイオードを用いた高速物理乱数生成実験,” 第５８回応用物理学会学術連合会　27a-BW-10　(神奈川工科大学)．
木下慶介, 吉岡拓也、中谷智広, “音声のブラインド残響除去：最新の研究動向,” IEICE Fundamental Review, Vol.4, No.4, pp.301-310, 2011

2010

論文

T. Yoshioka, T. Nakatani, M. Miyoshi, and H. G. Okuno, “New method for blind separation and dereverberation of highly reverberant mixtures,” accepted for publication in IEEE Transactions on Audio, Speech, and Language Processing, now available on IEEE Xplore, January 2010.
T. Oba, T. Hori, and A. Nakamura, “Improved Sequential Dependency Analysis Integrating Labeling-based Sentence Boundary Detection,” IEICE, Vol.E93-D,No.5,pp.-, May 2010.
J. Muramatsu, and S. Miyake, “Hash property and coding theorems for sparse matrices and maximal-likelihood coding,” IEEE Transactions on Information Theory, vol. IT-56, no. 5, pp. 2143-2167, May 2010.
J. Muramatsu, and S. Miyake, “Hash property and fixed-rate universal coding theorems,” IEEE Transactions on Information Theory, vol. IT-56, no. 6, pp. 2688-2698, Jun. 2010.
J. Muramatsu, and S. Miyake, “Construction of broadcast channel code based on hash property,” in Proceedings of the 2010 IEEE International Symposium on Information Theory, pp. 575-579, 2010.
H. Sawada, S. Araki and S. Makino, “Underdetermined Convolutive Blind Source Separation via Frequency Bin-wise Clustering and Permutation Alignment,” IEEE Trans. Audio, Speech, and Language Processing, (conditionally accepted).
K. Ishizuka, S. Araki, and T. Kawahara, “Speech activity detection for multi-party conversation analyses based on likelihood ratio test on spatial magnitude,” IEEE Transactions on Audio, Speech, and Language Processing (in press).
K. Ishizuka, T. Nakatani, M. Fujimoto, and N. Miyazaki, “Noise robust voice activity detection based on periodic to aperiodic component ratio,” Speech Communication, Vol.52, No.1, pp. 41-60, 2010.
S. Araki, H. Sawada, and S. Makino, “Blind Speech Separation in a Meeting Situation with Maximum SNR Beamformers,” IEEE Trans. Audio, Speech, and Language Processing, (submitting)
S. Watanabe and A. Nakamura, “Predictor-Corrector Adaptation based on a Macroscopic Time Evolution System,” IEEE Transactions on Audio, Speech, and Language Processing, vol. 18, issue 2, pp. 395-406, 2010.
西亀健太, 和泉洋介, 渡部晋治, 西本卓也, 小野順貴, 嵯峨山茂樹, “スパース性に基づくブラインド音源分離を用いたステレオ入力音声認識,” 電子情報通信学会論文誌 D-II, vol. J93-D, no. 3, pp. 303-311, 2010.

書籍、解説記事

伊藤慶明, 堀貴明, “音声認識研究の最近の動向と今後の展望： 7．音声認識の応用システム― 音声文書検索音声対話/音声翻訳の新たな展開 ―,” 日本音響学会誌 66巻 January 2010.
T. Yoshioka, T. Nakatani, K. Kinoshita, and M. Miyoshi, “Speech dereverberation and denoising based on time varying speech model and autoregressive reverberation model,” to appear in Speech Processing in Modern Communication: Challenges and Perspectives, Israel Cohen, Jacob Benesty, and Sharon Gannot (eds.), Springer, pp. 151-182, February 2010.
M. Fujimoto, K. Takeda, and S. Nakamura, “Chapter 4.4.2: An evaluation database for in-car speech recognition and its common evaluation framework,” in "Resources and Standards of Spoken Language Systems - Advances in Oriental Spoken Language Processing," World Scientific Publishing Co., March 2010.
M. Miyoshi, M. Delcroix, K. Kinoshita, T. Yoshioka, T. Nakatani, and T. Hikichi, “Inverse-filtering for speech dereverberation without the use of room acoustics information,” to appear in Speech Dereverberation, Patrick A. Naylor and Nikolay Gaubitch (eds.), Springer.
M. Fujimoto, “Chapter 1: Integration of statistical model-based voice activity detection and noise suppression for noise robust speech recognition,” in "Advances in Robust Speech Recognition Technology," Bentham Publishing Services. (in publishing)
渡部晋治, “音声認識における音響モデル研究の動向,” 日本音響学会誌66巻１号, pp. 599-604, 2010.
白井克彦編著 “音声言語処理の潮流,” コロナ社, 4.3節分担執筆(出版)

国際会議予稿

T. Yoshioka, T. Nakatani, and H. G. Okuno, “Noisy speech enhancement based on prior knowledge about spectral envelope and harmonic structure,” in Proceedings of the 2010 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2010), pp. 4270-4273, March 2010.
N. Yasuraoka, T. Yoshioka, T. Nakatani, A. Nakamura, and Hiroshi G. Okuno, “Music dereverberation using harmonic structure source model and Wiener filtering,” in Proceedings of the 2010 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2010), pp. 53-56, March 2010.
T. Hori, S. Watanabe, and A. Nakamura, “Search Error Risk Minimization in Viterbi Beam Search for Speech Recognition,” in Proc. ICASSP2010, pp. 4934-4937, March 2010.
T. Oba, T. Hori and A. Nakamura, “A Comparative Study on Methods of Weighted Language Model Training for Reranking LVCSR N-best Hypotheses,” in Proc. ICASSP2010, pp. 5126-5129, March 2010.
S. Watanabe, T. Hori, E. McDermott, and A. Nakamura, “A Discriminative Model for Continuous Speech Recognition Based on Weighted Finite State Transducers,” in Proc. ICASSP2010, pp. 4922-4925, March 2010.
A. Ogawa and A. Nakamura, “Discriminative confidence and error cause estimation for extended speech recognition function,” Proc. ICASSP, pp. 4454-4457, March 2010.
A. Ogawa and A. Nakamura, “A novel confidence measure based on marginalization of jointly estimated error cause probabilities,” Proc. Interspeech, September 2010.
J. Muramatsu, K. Yoshimura, and P. Davis, “Information theoretic security based on bounded observability,” Proceedings of the 4th International Conference on Information Theoretic Security, Lecture Notes in Computer Science (LNCS), vol.5973, pp.128-139, Springer (in press).
D. Cournapeau, S. Watanabe, A. Nakamura, and T. Kawahara, “Using Online Model Comparison In The Variational Bayes Framework For Online Unsupervised Voice Activity Detection,” ICASSP 2010, pp. 4462-4465, 2010.
E. McDermott, S. Watanabe, and A. Nakamura, “Discriminative Training Based On An Integrated View Of MPE And MMI In Margin And Error Space,” ICASSP 2010, pp. 4894-4897, 2010.
H. Watanabe, S. Katagiri, K. Yamada, E. McDermott, A. Nakamura, S. Watanabe, and M. Ohsaki, “Minimum Error Classification With Geometric Margin Control,” ICASSP 2010, pp. 2170-2173, 2010.
K. Aoyama, S. Watanabe, H. Sawada, Y. Minami, N. Ueda, and K. Saito, “Fast Similarity Search On A Large Speech Data Set With Neighborhood Graph Indexing,” ICASSP 2010, pp. 5358-5361, 2010.
S. Araki, T. Nakatani and H. Sawada, “Simultaneous clustering of mixing and spectral model parameters for blind sparse source separation,” ICASSP2010, 2010.
T. Nakatani and S. Araki, “SINGLE CHANNEL SOURCE SEPARATION BASED ON SPARSE SOURCE OBSERVATION MODEL WITH HARMONIC CONSTRAINT,” ICASSP2010, 2010.
Y. Ansai, S. Araki, S. Makino, T. Nakatani, T. Yamada, A. Nakamura and N. Kitawaki, “Cepstral Smoothing of Separated Signals for Underdetermined Speech Separation,” ISCAS2010, (to appear)

その他会議予稿

久保陽太郎, 渡部晋治, 中村篤, 小林哲則, “最小相対エントロピー識別学習へのラティスによる仮説表現と並列化可能な最適化手法の導入,” 情報処理学会研究報告, Vol.2010-SLP-80 No.8, February 2010.
吉岡　拓也, 中谷　智広, 奥乃　博, “スペクトル包絡の事前学習と調波構造モデルを併用した音声強調,” 日本音響学会 2010年春季研究発表会講演論文集, 3-5-8, pp. 773-776, March 2010.
安良岡　直希, 吉岡　拓也, 中谷　智広, 中村　篤, 奥乃　博, “調波GMMとWienerフィルタに基づく音楽音響信号の残響抑圧,” 情報処理学会第72回全国大会講演論文集, 5T-4, vol. 2, pp. 181-182, March 2010.
藤本雅清, 渡部晋治, 中谷智広, “Dirichlet事前分布を用いた音声区間検出法の評価と考察,” 日本音響学会, 平成22年度春季研究発表会, 1-6-5, pp. 13-17, March 2010.
田代(AS研), 荒木, 木村, 中村, “停電時上り音声通信を実現する光アクセス方式の提案,” 電子情報通信学会2010年総合大会, March 2010.
渡部晋治, 堀貴明, Erik McDermott, 中村篤, “重み付有限状態トランスデューサを利用した, 連続音声認識のための識別モデルの提案,” 音響学会講演論文集, 1-6-13, March 2010.
堀貴明, 渡部晋治, 中村篤, “サーチエラーリスク最小化に基づくViterbiビーム探索法の改善,” 音響学会講演論文集, 2-6-7, March 2010.
増村亮, 大庭隆伸, 伊藤彰則, 牧野正三, “線形分類器による音響モデル,” 音響学会講演論文集, pp. 29-30, March 2010.
大庭隆伸, 南泰浩, “クラス分類問題の強化学習による解釈,” 情報処理全国大会, Vol.2, pp. 93-94, March 2010.
小川厚徳，中村篤， “信頼度と誤り原因の推定における識別モデルの検討,” 音講論集，1-Q-6, March 2010.
小川厚徳，中村篤, “同時推定した誤り原因確率の周辺化に基づく信頼度,” 音講論集，1-Q-19, September 2010.
安齊(筑波大), 荒木, 牧野, 中谷, 山田, 中村, 北脇, “劣決定音源分離のための分離音声のケプストラムスムージング,” 日本音響学会2010年春季研究発表会, 2010.
荒木, 中谷, 澤田, “マイク間位相差とスペクトル包絡の同時クラスタリングに基づくスパース音源分離,” 日本音響学会2010年春季研究発表会, 2010.

2009

論文

T. Yoshioka, T. Nakatani, and M. Miyoshi, “Integrated speech enhancement method using noise suppression and dereverberation,” IEEE Transactions on Audio, Speech and Language Processing, vol. 17, no. 2, pp. 231-246, February 2009.
中谷智広, 吉岡拓也, 木下慶介, 三好正人, “時変ガウス音源モデルと多チャンネル自己回帰観測モデルに基づく最ゆう法による音響信号の残響除去,” 電子情報通信学会論文誌 A, vol. J92-A, no. 5, pp. 294-304, May, 2009.
S. Miyake, and J. Muramatsu, “A Construction of Channel Code, Joint Source-Channel Code, and Universal Code for Arbitrary Stationary Memoryless Channels using Sparse Matrices,” IEICE Transactions on Fundamentals, vol.E92-A, no.9, pp.2333-2344, September 2009.
H. K. Solvang, Y. Nagahara, S. Araki, H. Sawada and S. Makino, “Frequency-Domain Pearson Distribution Approach for Independent Component Analysis (FD-Pearson-ICA) in Blind Source Separation,” IEEE Trans. Speech & Language Processing, vol. 17, no. 4, pp. 639-649, 2009.
K. Kinoshita, M. Delcroix, T. Nakatani and M. Miyoshi, “Suppression of late reverberation effect on speech signal using long-term multiple-step linear prediction,” IEEE Transactions on Audio, Speech and Language Processing.
M. Delcroix, T. Nakatani, and S. Watanabe, “Static and dynamic variance compensation for recognition of reverberant speech with dereverberation pre-processing,” IEEE Transactions on Audio, Speech, and Language Processing, vol. 17, issue 2, pp. 324-334, 2009.
S. Araki, H. Sawada, R. Mukai and S. Makino, “DOA estimation for multiple sparse sources with arbitrarily arranged multiple sensors,” Journal of Signal Processing Systems, doi:10.1007/s11265-009-0413-9, 2009.
村松純, 三宅茂樹 “疎行列アンサンブルのハッシュ性と多端子情報源符号,” 統計数理, vol.57, no.2, pp.203-219, 2009.

書籍、解説記事

T. Hori, K. Sudoh, H. Tsukada, and A. Nakamura, “World-Wide Media Browser--Multilingual Audio-visual Content Retrieval and Browsing System,” NTT Technical Review, Vol. 7, No. 2, February 2009.
堀貴明, 須藤克仁, 塚田元, 中村篤, “世界メディアブラウザ,” NTT技術ジャーナル 2009年５月号.
堀貴明, 村松純 “日本企業から米国・欧州大学への派遣体験～米国マサチューセッツ工科大学／スイス連邦チューリヒ工科大学編～,” 電子情報通信学会誌, pp.400-404, 2009.
S. Makino, S. Araki, S. Winter, H. Sawada, “Underdetermined Blind Source Separation using Acoustic Arrays,” Handbook on Array Processing and Sensor Networks, S. Haykin, and K. J. R. Liu Eds., Wiley, 2009 (in press).
石塚健太郎, 藤本雅清, 中谷智広, “音声区間検出技術の最近の研究動向,” 日本音響学会誌, Vol.52, No.10, pp.537-543, 2009.

国際会議予稿

T. Yoshioka, H. Tachibana, T. Nakatani, and M. Miyoshi, “Adaptive dereverberation of speech signals with speaker-position change detection,” in Proceedings of the 2009 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2009), pp. 3733-3736, April 2009.
H. Kameoka, T. Nakatani, and T. Yoshioka, “Robust speech dereverberation based on non-negativity and sparse nature of speech spectrograms,” in Proceedings of the 2009 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2009), pp. 45-48, April 2009.
T. Nakatani, T. Yoshioka, K. Kinoshita, M. Miyoshi, and B.-H. Juang, “Real-time speech enhancement in noisy reverberant multi-talker environments based on a location-independent room acoustics model,” to appear in Proceedings of the 2009 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2009), pp. 137-140, April 2009.
A. Ogawa, S. Takahashi, and A. Nakamura, “Efficient combination of likelihood recycling and batch calculation based on conditional fast processing and acoustic back-off,” Proc. ICASSP, pp. 4164-4164, April 2009.
T. Yoshioka, T. Nakatani, and M. Miyoshi, “Fast algorithm for conditional separation and dereverberation,” in Proceedings of the 17th European Signal Processing Conference (EUSIPCO 2009), CD-ROM Proceedings, August 2009.
A. Ogawa and A. Nakamura, “Simultaneous estimation of confidence and error cause in speech recognition using discriminative model,” Proc. Interspeech, pp. 1199-1202, September 2009.
S. Kobashikawa, A. Ogawa, Y. Yamaguchi, and S. Takahashi, “Rapid unsupervised adaptation using frame independent output probabilities of gender and context independent phoneme models,” Proc. Interspeech, pp.1615-1618, September 2009.
T. Yoshioka, H. Kameoka, T. Nakatani, and H. G. Okuno, “Statistical models for speech dereverberation,” in Proceedings of the 2009 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA 2009), pp. 145-148, October 2009.
A. Nakamura, E. McDermott, S. Watanabe, S. Katagiri, “A unified view for discriminative objective functions based on negative exponential of difference measure between strings,” Proc. ICASSP 2009, pp. 1633-1636, 2009.
E. McDermott, S. Watanabe, and A. Nakamura, “Margin-Space Integration of MPE Loss via Differencing of MMI Functionals for Generalized Error-Weighted Discriminative Training,” Proc. Interspeech 2009 Eurospeech, pp. 224-227, 2009.
E. Vincent (IRISA-INRIA), S. Araki, and P. Bofill (Technical University of Catalonia), “The 2008 Signal Separation Evaluation Campaign: A Community-Based Approach to Large-Scale Evaluation,” ICA2009, pp. 734-741, 2009.
J. Muramatsu, and S. Miyake, “Coding theorem for general stationary memoryless channel based on hash property,” Proceedings of the 2009 IEEE International Symposium on Information Theory, Seoul, Korea, pp.541-545, 2009.
J. Muramatsu, and S. Miyake, “Construction of wiretap channel codes by using sparse matrices,” Proceedings of the 2009 IEEE Information Theory Workshop, Taormina, Italy, pp.105-109, 2009.
K. Ishiguro, T. Yamada, S. Araki and T. Nakatani, “A PROBABILISTIC SPEAKER CLUSTERING FOR DOA-BASED DIARIZATION,” WASPAA2009, 2009.
K. Ishizuka, S. Araki, K. Otsuka, T. Nakatani, and M. Fujimoto, “A speaker diarization method based on the probabilistic fusion of audio-visual location information,” Proceedings of the 11th International Conference on Multimodal Interfaces and Workshop on Machine Learning for Multi-modal Interaction (ICMI-MLMI2009), pp.55-62, 2009.
K. Otsuka, S. Araki, D. Mikami, K. Ishizuka, M. Fujimoto, and J. Yamato, “Realtime meeting analysis and 3D meeting viewer based on omnidirectional multimodal sensors,” Proceedings of the 11th International Conference on Multimodal Interfaces and Workshop on Machine Learning for Multi-modal Interaction (ICMI-MLMI2009), pp.219-220, 2009.
M. Fujimoto, K. Ishizuka, and T. Nakatani, “A study of mutual front-end processing method based on statistical model for noise robust speech recognition,” Proc. of Interspeech '09, pp. 1235-1238, September 2009.
R. Mugitani, K. Ishizuka, T. Kondo, and S. Amano, “Acquisition of durational control of vocalic and consonantal intervals in speech production,” The 34th Boston University Conference on Language Development (BUCLD34), 2009.
S. Araki, T. Nakatani, H. Sawada, and S. Makino, “Blind sparse source separation for unknown number of sources using Gaussian mixture model fitting with Dirichlet prior,” ICASSP2009, pp.33-36, 2009.
S. Araki, T. Nakatani, H. Sawada, and S. Makino, “Stereo source separation and source counting with MAP estimation with Dirichlet prior considering spatial aliasing problem,” ICA2009, pp. 742-750, 2009.
S. Watanabe and A. Nakamura, “Speech recognition with incremental tracking and detection of changing environments based on a macroscopic time evolution system,” Proc. ICASSP 2009, pp. 4373-4376, 2009.
T. Iwata, S. Watanabe, T. Yamada, and N. Ueda, “Topic tracking model for analyzing consumer purchase behavior,” IJCAI 2009, pp. 1427-1432, 2009.
T. Tashiro(AS研), S. Araki, Y. Nakanishi(NTT東), H. Kimura(AS研), K. Kumozaki(AS研) and M. Miyoshi, “Optical Access System with Emergency Voice Communication Using Blind Speech Separation for Demultiplexing Randomly Mixed Signals,” GLOBECOM, 2009.
Y. Izumi, K. Nishiki, S. Watanabe, T. Nishimoto, N. Ono, and S. Sagayama, “Stereo-input Speech Recognition using Sparseness-based Time-frequency Masking in a Reverberant Environment,” Proc. Interspeech 2009 Eurospeech, pp. 1955-1958, 2009.
S. Kobashikawa, A. Ogawa, Y. Yamaguchi, and S. Takahashi, “Rapid unsupervised adaptation using context independent phoneme model,” The 13th IEEE International Symposium on Consumer Electronics (ISCE'09), 2009.

その他会議予稿

藤本雅清, 石塚健太郎, 中谷智広, “確率モデルに基づく音声区間検出法における音声ゲイン補正の検討,” 日本音響学会, 平成21年度春季研究発表会, 1-5-2, pp. 3-6, March 2009.
渡辺秀行, 片桐滋, 山田幸太, 中村篤, マクダーモットエリック, 渡部晋治, 谷口真一, 西島奈甫, 大崎美穂, “アンサンブル型最小分類誤り学習の提案” 信学技報, vol. 108, no. 484, PRMU2008-250, pp. 71-76, March 2009.
D. Cournapeau, S. Watanabe, A. Nakamura, and T. Kawahara, “Using online free energy for model comparison with application to voice activity detection,” 音響学会講演論文集, 2-5-14, March 2009.
久保陽太郎, 渡部晋治, 中村篤, 白井克彦, “最小相対エントロピー基準によるパラメタ分布の正則化を用いた連続分布HMM の識別学習,” 音響学会講演論文集, 2-5-16, March 2009.
小川厚徳，中村篤, “最大エントロピーモデルに基づく信頼度と誤認識原因の同時推定,” 音講論集，2-5-17, March 2009.
小橋川哲，小川厚徳，山口義和，高橋敏， “音素環境独立モデルに基づく高速教師なし適応の検討,” 音講論集，1-P-30, March 2009.
渡部晋治, “[招待講演] 音響モデルのベイズ学習,” 情報処理学会研究報告, Vol.2009-SLP-77, No. 9, July 2009.
久保陽太郎, 渡部晋治, 中村篤, マクダーモットエリック, 小林哲則, “最小相対エントロピー識別学習に基づくカーネルマシンを利用した音声認識,” 情報処理学会研究報告, Vol.2009-SLP-77, No.6, July 2009.
山田幸太, 片桐滋,マクダーモットエリック, 渡辺　秀行, 中村篤, 渡部晋治, 大崎美穂 “最小分類誤り学習における幾何マージンの制御法について,” 信学技報, vol. 109, no. 139, SP2009-43, pp. 13-18, July 2009.
渡辺秀行, 片桐滋, 山田幸太, マクダーモットエリック, 中村篤, 渡部晋治, 大崎美穂, “判別関数の一般形に対する幾何マージンの導出とその制御を伴う最小分類誤り学習,” 信学技報, vol. 109, no. 182, PRMU2009-60, pp. 1-6, August 2009.
吉岡　拓也, 亀岡弘和, 中谷智広, 奥乃　博, “少量データに頑健な残響抑圧のためのMSPP法,” 日本音響学会 2009年秋季研究発表会講演論文集, 2-4-1, pp. 609-612, September 2009.
吉岡　拓也, 中谷智広, 奥乃　博, “重みつき予測誤差法におけるMIMO残響除去フィルタの効率的最適化法,” 日本音響学会 2009年秋季研究発表会講演論文集, 2-4-17, pp. 651-654, September 2009.
藤本雅清, 中谷智広, “確率モデルに基づく音声区間検出法における確率分布選択と確率重み付けの検討,” 日本音響学会, 平成21年度秋季研究発表会, 1-1-14, pp. 43-46, September 2009.
渡部晋治, 岩田具治, 堀貴明, 佐古淳, 有木康雄, “話題追従型言語モデルについての考察,” 音響学会講演論文集, 2-1-3, September 2009.
久保陽太郎, 渡部晋治, 中村篤, マクダーモットエリック, 小林哲則, “隠れマルコフモデルの最小相対エントロピー識別学習則より導出されるカーネルマシンを用いた音声認識,” 音響学会講演論文集, 1-1-4, September 2009.
堀貴明, 渡部晋治, 中村篤, “サーチエラーリスク最小化に基づくビーム探索,” 音響学会講演論文集, 3-1-8, September 2009.
大庭隆伸, 堀貴明, 中村篤, “誤り訂正言語モデルのシンボル系列重み付き学習法に関する考察,” 音響学会講演論文集, pp. 179-180, September 2009.
堀貴明, 渡部晋治, 中村篤, “サーチエラーリスク最小化に基づくViterbiビーム探索とその評価,” 第11回音声言語シンポジウム，SP2009-79, pp.31-36, December 2009
藤本雅清, 渡部晋治, 中谷智広, “Dirichlet事前分布を用いた音声区間検出の検討,” 情報処理学会研究報告, Vol.2009-SLP-79 No.12, December 2009.
K. Kinoshita, T. Nakatani, M. Miyoshi and T. Kubota, “Blind upmix of stereo music signal using multi-step linear prediction based reverberation extraction,” International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 49-52, 2009.
久保陽太郎, 渡部晋治, 中村篤, マクダーモットエリック, 小林哲則, “隠れマルコフカーネルマシンを用いた系列データの識別とその音素認識タスクへの適用,” 第12回情報論的学習理論IBIS2009, P106.
荒木, 中谷, 澤田, “ディリクレ事前分布を用いた音声のスパース性に基づく音源数推定と音源分離,” 日本音響学会2009年秋季研究発表会, 2009.
小笠原(名大), 石塚, 荒木, 藤本, 中谷, 大塚, “SN 比最大化ビームフォーマを用いたオンライン会議音声強調,” 日本音響学会2009年春季研究発表会, 2009.
小笠原基, 石塚健太郎, 荒木章子, 藤本雅清, 中谷智広, 大塚和弘, “SN比最大化ビームフォーマを用いたオンライン会議音声強調,” 日本音響学会講演論文集, 2-9-17, 春季, pp.695-698, 2009.
石黒, 山田, 荒木, 中谷, “ノンパラメトリックベイズを用いた会議音声話者識別のための話者クラスタリング法,” 日本音響学会2009年春季研究発表会, pp.107-110, 2009.
石塚, 荒木, 大塚, 中谷, 藤本, “音響情報と映像情報から得られる位置情報の統合による話者ダイアライゼーション,” 日本音響学会2009年春季研究発表会, 2009.
石塚健太郎, 荒木章子, 大塚和弘, 中谷智広, 藤本雅清, “音響情報と映像情報から得られる位置情報の統合による話者ダイアライゼーション,” 日本音響学会講演論文集, 3-5-6, 春季, pp.111-112, 2009.
渡辺　秀行, 片桐滋, 山田幸太, マクダーモットエリック, 中村篤, 渡部晋治, 大崎美穂, “大幾何マージン最小分類誤り学習法,” 第12回情報論的学習理論IBIS2009, P043.
藤本雅清, 石塚健太郎, 中谷智広, “確率モデルに基づく音声区間検出法における音声ゲイン補正の検討,” 日本音響学会講演論文集, 1-5-2, 春季, pp.3-6, 2009.
木下慶介, 中谷智広, 三好正人, “残響除去原理に基づき作成したステレオ音楽サラウンド再生音の主観評価,” 日本音響学会秋季研究発表会, pp.759-760, 2009
J. Muramatsu, and S. Miyake, “Hash property and fixed-rate universal coding theorems,” 第32回情報理論とその応用シンポジウム予稿集, pp. 388-393, 2009.
J. Muramatsu, K. Yoshimura, and P. Davis, “Information theoretic security based on bounded observability,” 第7回シャノン理論ワークショップ予稿集, pp. 1-6, 2009.

2008

論文

J. Muramatsu, “Effect of random permutation of symbols in a sequence,” IEEE Transactions on Information Theory, vol.IT-54, no.1, pp.78-86, January 2008.
J. Muramatsu, K. Yoshimura, K. Arai, and P. Davis, “Some results on secret key agreement using correlated sources,” NTT Technical Review, vol.6, No.2, February 2008.
M. Fujimoto and K. Ishizuka, “Noise Robust Voice Activity Detection Based on Switching Kalman Filter,” IEICE Transactions on Information and Systems, Vol. E91-D, No. 3, pp. 467-477, March 2008.
S. Miyake, and J. Muramatsu, “A construction of lossy source code using LDPC matrices,” IEICE Transactions on Fundamentals, vol.E91-A, no.6, pp.1488-1501, June 2008.
T. Oba, T. Hori, and A. Nakamura, “Sequential Dependency Analysis for Online Spontaneous Speech Processing,” Speech Communication, Volume 50, Issue 7, pp. 616-625, July 2008.
T. Nakatani, B.-H. Juang, T. Yoshioka, K. Kinoshita, M. Delcroix, and M. Miyoshi, “Speech dereverberation based on maximum likelihood estimation with time-varying Gaussian source model,” IEEE Transactions on Audio, Speech and Language Processing, vol. 16, no. 8, pp. 1512-1527, November 2008.
K. Yoshimura, J. Muramatsu, and P. Davis, “Conditions for common-noise-induced synchronization in time-delay systems,” Physica D, vol. 237, no. 23, pp.3146-3152, December 2008.
H. K. Solvang, K. Ishizuka, and M. Fujimoto, “Voice activity detection based on adjustable linear prediction and GARCH models,” Speech Communication, Vol.50, No.6, pp.476-486, 2008.
T. Nakatani, S. Amano, T. Irino, K. Ishizuka, and T. Kondo, “A method for fundamental frequency estimation and voicing decision: Application to infant utterances recorded in real acoustical environments,” Speech Communication, Vol.50, No.3, pp.203-214, 2008.
大和田功, 山本徹, 葉海鵬, 内田淳史, 吉森茂, 吉村和之, 村松純, 後藤振一郎, Peter Davis, “半導体レーザを用いた共通信号入力におけるカオス同期の数値解析,” 電気学会論文誌C, vol.128, no.5, pp.768-774, 2008.

書籍、解説記事

堀貴明, 須藤克仁, 塚田元, 中村篤, “世界中の音映像コンテンツを日本語で視聴する技術,” ITUジャーナル 2008年８月号.
S. Makino, S. Araki, and H. Sawada, “Underdetermined Blind Source Separation using Acoustic Arrays,” in Handbook on Array Processing and Sensor Networks, S. Haykin and K.J. Ray Liu, Eds, Wiley, 2008.
村松純, “チューリヒ工科大学滞在記,” SITA ニューズレター, 2008.
白木善尚編, 村松純, 岩田賢一, 有村光晴, 渋谷智治共著, “IT Text シリーズ情報理論,” オーム社, 2008.

国際会議予稿

T. Yoshioka, T. Nakatani, T. Hikichi, and M. Miyoshi, “Maximum likelihood approach to speech enhancement for noisy reverberant signals,” in Proceedings of the 2008 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2008), pp. 4585-4588, March 2008.
T. Yoshioka and M. Miyoshi, “Adaptive suppression of non-stationary noise by using variational Bayesian method,” in Proceedings of the 2008 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2008), pp. 4889-4892, March 2008.
T. Nakatani, T. Yoshioka, K. Kinoshita, M. Miyoshi, and B.-H. Juang, “Blind speech dereverberation with multi-channel linear prediction based on short time Fourier transform representation,” in Proceedings of the 2008 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2008), pp. 85-88, March 2008.
M. Fujimoto, K. Ishizuka, and T. Nakatani, “A Voice Activity Detection Based on the Adaptive Integration of Multiple Speech Features and a Signal Decision Scheme,” Proc. ICASSP '08, pp. 4441-4444, March 2008.
A. Ogawa and S. Takahashi, “Weighted distance measures for efficient reduction of Gaussian mixture components in HMM-based acoustic model,” Proc. ICASSP, pp. 4173-4176, March 2008.
T. Oba, T. Hori, and A. Nakamura, “Efficient Discriminative Training of Error Corrective Models Using High-WER Competitors,” Asian Workshop on Speech Science and Technology, IEICE Technical Report SP2007-185-214, pp. 99-104, March 2008.
T. Nakatani, T. Yoshioka, K. Kinoshita, M. Miyoshi, and B.-H. Juang, “Speech dereverberation in short time Fourier transform domain with cross band effect compensation,” in Proceedings of the 2008 Joint Workshop on Hands-free Speech Communication and Microphone Arrays (HSCMA 2008), pp. 220-223, May 2008.
T. Yoshioka, T. Nakatani, and M. Miyoshi, “An integrated method for blind separation and dereverberation of convolutive audio mixtures,” in Proceedings of the 16th European Signal Processing Conference (EUSIPCO 2008), CD-ROM Proceedings, August 2008.
T. Yoshioka, T. Nakatani, and M. Miyoshi, “Enhancement of noisy reverberant speech by linear filtering followed by nonlinear noise suppression,” in Proceedings of the 2008 International Workshop on Acoustic Echo and Noise Control (IWAENC 2008), CD-ROM Proceedings, September 2008.
T. Nakatani, T. Yoshioka, K. Kinoshita, M. Miyoshi, and B.-H. Juang, “Incremental estimation of reverberation with uncertainty using prior knowledge of room acoustics for speech dereverberation,” in Proceedings of the 2008 International Workshop on Acoustic Echo and Noise Control (IWAENC 2008), CD-ROM Proceedings, September 2008.
M. Fujimoto, K. Ishizuka, and T. Nakatani, “Study of Integration of Statistical Model-Based Voice Activity Detection and Noise Suppression,” Proc. Interspeech '08, September 2008.
M. Miyoshi, K. Kinoshita, T. Nakatani, and T. Yoshioka, “Principles and applications of dereverberation for noisy and reverberant audio signals,” in Proceedings of the 2008 Asilomar Conference on Signals, Systems, and Computers, CD-ROM Proceedings, October 2008.
S. Miyake, and J. Muramatsu, “A construction of channel code, joint source-channel code, and universal code for arbitrary stationary memoryless channels using sparse matrices,” Proceedings of the 2008 IEEE International Symposium on Information Theory, Toronto, Canada, pp.1193-1197, 2008.
D. Kolossa (TU Berlin), S. Araki, M. Delcroix, T. Nakatani, R. Orglmeister (TU Berlin), S. Makino, “Missing Feature Speech Recognition in a Meeting Situation with Maximum SNR Beamforming,” ISCAS2008, pp. 3218-3221, 2008.
J. Muramatsu, and S. Miyake, “Hash property and multi-terminal source coding theorems for sparse matrices and maximal-likelihood coding,” Proceedings of the 2008 IEEE International Symposium on Information Theory, Toronto, Canada, pp.424-428, 2008.
J. Muramatsu, and S. Miyake, “Lossy source coding algorithm using lossless multi-terminal source codes,” Proceedings of the 2008 International Symposium on Information Theory and its Applications, Auckland, New Zealand, pp.606-611, 2008.
K. Ishizuka, S. Araki, and T. Kawahara, “Statistical speech activity detection based on spatial power distribution for analyses of poster presentations,” Proceedings of the 10th International Conference on Spoken Language Processing (Interspeech2008 - ICSLP), pp.99-102, 2008.
K. Otsuka, S. Araki, K. Ishizuka, M. Fujimoto, M. Heinrich, J. Yamato, “A Realtime Multimodal System for Analyzing Group Meetings by Combining Face Pose Tracking and Speaker Diarization,” ICMI2008, pp. 257-264, 2008.
K. Otsuka, S. Araki, K. Ishizuka, M. Fujimoto, M. Heinrich, and J. Yamato, “A realtime multimodal system for analyzing group meetings by combining face pose tracking and speaker diarization,” Proceedings of the 10th International Conference on Multimodal Interfaces (ICMI2008), pp. 257-264, 2008.
M. Delcroix, T. Nakatani, and S. Watanabe, “Combined static and dynamic variance adaptation for efficient interconnection of speech enhancement pre-processor with speech recognizer,” Proc. ICASSP 2008, pp. 4073-4076, 2008.
M. Fujimoto, K. Ishizuka, and T. Nakatani, “A voice activity detection based on the adaptive integration of multiple speech features and a signal decision scheme,” Proceedings of the 33rd International Conference on Acoustics, Speech and Signal Processing (ICASSP2008), pp.4441-4444, 2008.
M. Fujimoto, K. Ishizuka, and T. Nakatani, “Study of integration of statistical model-based voice activity detection and noise suppression,” Proceedings of the 10th International Conference on Spoken Language Processing (Interspeech2008 - ICSLP), pp.2008-2011, 2008.
S. Araki, M. Fujimoto, K. Ishizuka, H. Sawada, and S. Makino, “A DOA based speaker diarization system for real meetings,” Proceedings of the Joint Workshop on Hands-Free Speech Communication and Microphone Arrays (HSCMA2008), pp.29-32, 2008.
S. Araki, M. Fujimoto, K. Ishizuka, H. Sawada, and S. Makino, “Speaker indexing and speech enhancement in real meetings / conversations,” Proceedings of the 33rd International Conference on Acoustics, Speech and Signal Processing (ICASSP2008), pp.93-96, 2008.
S. Watanabe and A. Nakamura, “A unified interpretation of adaptation techniques based on a macroscopic time evolution system with indirect/direct approaches,” Proc. ICASSP 2008, pp. 4285-4286, 2008.
T. Hager, S. Araki, K. Ishizuka, M. Fujimoto, T. Nakatani, and S. Makino, “Handling speaker position changes in a meeting diarization system by combining DOA clustering and speaker identification,” Proceedings of the 11th International Workshop on Acoustic Echo and Noise Control (IWAENC2008), 2008.
T. Hager, S. Araki, K. Ishizuka, M. Fujimoto, T. Nakatani, S. Makino, “Handling speaker position changes in a meeting diarization system by combining DOA clustering and speaker identification,” IWAENC2008 CD-ROM proceedings, 2008.
T. Kawahara, H. Setoguchi, K. Takanashi, K. Ishizuka, and S. Araki, “Multi-modal recording, analysis and indexing of poster sessions,” Proceedings of the 10th International Conference on Spoken Language Processing (Interspeech2008 - ICSLP), pp.1622-1625, 2008.

その他会議予稿

堀貴明, 須藤克仁, 大庭隆伸, 渡部晋治, 渡辺太郎, 塚田元, 中村篤, “「世界メディアブラウザ」－音声認識と統計翻訳に基づく多言語動画コンテンツ　検索／閲覧システム,” 第2回音声ドキュメント処理ワークショップ, pp. 59-64, February 2008.
吉岡　拓也, 中谷　智広, 三好　正人, “雑音と残響の同時抑圧による音声強調,” 日本音響学会 2008年春季研究発表会講演論文集, 3-6-10, pp. 731-732, March 2008.
中谷　智広, 吉岡　拓也, 木下　慶介, 三好　正人, ジュアング　ビン・ファン, “短時間フーリエ変換表現を用いた最尤推定に基づく音声信号の残響除去,” 日本音響学会 2008年春季研究発表会講演論文集, 3-6-11, pp. 733-734, March 2008.
藤本雅清, 石塚健太郎, 中谷智広, “確率モデルに基づく音声区間検出と雑音抑圧の統合の検討,” 日本音響学会, 平成20年度春季研究発表会, 1-10-9, pp. 27-30, March 2008.
荒木, 藤本, 石塚, 澤田, 牧野, “音声区間検出と方向情報を用いた会議音声話者識別システムとその評価,” 日本音響学会2008年春季研究発表会, March 2008.
荒木, 澤田, 牧野, “音声のスパース性を用いたUnderdetermined音源分離,” 電子情報通信学会2008年総合大会, March 2008.
荒木, 伊藤(東大), 澤田, 小野(東大), 牧野, 嵯峨山(東大), “周波数領域ICAにおける初期値の短時間データからの学習,” 電子情報通信学会2008年総合大会, March 2008.
大庭隆伸, 堀貴明, 中村篤, “単語誤り率を考慮した誤り訂正モデル学習とその効果に関する分析,” 日本音響学会講演論文集, pp.128-129, March 2008.
小川厚徳，高橋敏, “状態尤度近似とバッチ状態尤度計算の組み合わせによる音響尤度計算の高速化,” 音講論集，2-10-10, March 2008.
小橋川哲，小川厚徳，政瀧浩和，高橋敏， “キーワードに関する十分統計量増強による精度向上の検討,” 音講論集，1-Q-23, March 2008.
西亀健太, 渡部晋治, 西本卓也, 小野順貴, 嵯峨山茂樹, “複数残響特性下の音声を単一モデル学習に用いた未知残響環境に頑健な音声認識,” 2008-SP-8, pp. 43-48, May 2008.
藤本雅清, 石塚健太郎, 中谷智広, “確率モデルに基づく音声区間検出と雑音抑圧の統合法の評価と考察,” 電子情報通信学会, 音声研究会, SP2008-45, pp. 13-18, July 2008.
渡部晋治, 堀貴明, 中村篤, “複数音響環境の発話単位遷移モデルに基づく適応学習法の検討,” 電子情報通信学会研究報告2008-SP-54, pp. 67-72, July 2008.
【招待講演】大庭隆伸, “識別的言語モデルの可能性,” 電子情報通信学会研究技術報告, 2008-SLP-72, pp. 47-50, July 2008.
吉岡　拓也, 中谷智広, 三好　正人, “雑音・残響抑圧を目的とした線形フィルタに非線形フィルタを後置させた系の最適化法,” 日本音響学会 2008年秋季研究発表会講演論文集, 3-P-35, pp. 845-846, September 2008.
吉岡　拓也, 中谷　智広, 三好　正人, “ブラインド音源分離と残響除去の統合のための一手法,” 日本音響学会 2008年秋季研究発表会講演論文集, 3-8-9, pp. 703-704, September 2008.
亀岡　弘和, 中谷　智広, 吉岡　拓也, “音声のスパース性と非負制約つき畳み込みモデルに基づくパワースペクトル領域残響除去,” 日本音響学会 2008年秋季研究発表会講演論文集, 3-8-10, pp. 705-708, September 2008.
中谷　智広, 吉岡　拓也, 木下　慶介, 三好　正人, ジュアング　ビン・ファン, “室内伝達特性の確率モデルを用いて推定された残響信号の事後分布に基づく逐次的な残響除去,” 日本音響学会 2008年秋季研究発表会講演論文集, 1-P-17, pp. 753-756, September 2008.
藤本雅清, 石塚健太郎, 中谷智広, “確率モデルに基づく統合的フロントエンド処理の検討,” 日本音響学会, 平成20年度秋季研究発表会, 1-1-5, pp. 11-14, September 2008.
西亀健太, 和泉洋介, 小野順貴, 西本卓也, 嵯峨山茂樹, 渡部晋治, “音声スパース性に基づく2ch BSS を用いた雑音・残響下での音声認識,” 音響学会講演論文集, 1-1-4, September 2008.
堀貴明, 須藤克仁, 大庭隆伸, 渡部晋治, 小川厚徳, 渡辺太郎, マクダーモットエリック, 塚田元, 中村篤, “「世界メディアブラウザ」－音声認識と統計翻訳に基づく多言語動画コンテンツ検索／閲覧システム－,” 音響学会講演論文集, 1-1-17, September 2008.
渡部晋治, 中村篤, “巨視的な時間発展系に基づく逐次追従型音声認識,” 音響学会講演論文集, 2-P-9, September 2008.
D. Cournapeau, T. Kawahara, S. Watanabe, and A. Nakamura, “An Application of Online VB-EM Algorithm to Voice Activity Detection,” 音響学会講演論文集, 3-Q-11, September 2008.
藤本雅清, 石塚健太郎, 中谷智広, “音声区間検出と雑音抑圧の統合法を用いた雑音下音声認識,” 電子情報通信学会, 音声研究会, SP2008-81, pp. 13-18, December 2008.
西亀健太, 和泉洋介, 渡部晋治, 西本卓也, 小野順貴, 嵯峨山茂樹, “スパース性に基づくブラインド音源分離を用いた２チャンネル入力音声認識,” 信学技報, vol. 108, no. 338, SP2008-79, pp. 1-6, December 2008.
佐古淳, 有木康雄, 岩田具治, 渡部晋治, 堀貴明, “話題の連続/不連続変化を考慮したトピックモデルに基づく音声認識,” 信学技報, vol. 108, no. 338, SP2008-88, pp. 55-60, December 2008.
堀貴明, “10周年企画「音声言語研究関連分野の10年の歩み」～フロントエンド特徴抽出音響モデル,” 第10回音声言語シンポジウム, December 2008.
K. Kinoshita, T. Nakatani, M. Miyoshi and T. Kubota, “A new audio post-production tool for speech dereverberation,” Audio Engineering Society (AES) 125th Convention, San Francisco, 2008
岩田具治, 渡部晋治, 山田武士, 上田修功, “トピックモデルに基づくユーザ興味の追跡,” 第１１回情報論的学習理論IBIS2008, A18.
荒木, 藤本, 石塚, 中谷, 澤田, 牧野, “音声区間推定と時間周波数領域方向推定の統合による会議音声話者識別,” 電子情報通信学会技術研究報告, Vol.EA2008-40, pp. 19-24, 2008.
荒木章子, 藤本雅清, 石塚健太郎, 中谷智広, 澤田宏, 牧野昭二, “音声区間推定と時間周波数領域方向推定の統合による会議音声話者識別,” 電子情報通信学会技術研究報告, EA2008-40, pp.19-24, 2008.
荒木章子, 藤本雅清, 石塚健太郎, 澤田宏, 牧野昭二, “音声区間検出と方向情報を用いた会議音声話者識別システムとその評価,” 日本音響学会講演論文集, 1-10-1, 春季, pp.1-4, 2008.
石塚健太郎, 荒木章子, 大塚和弘, 中谷智広, 藤本雅清, “音響情報と映像情報の統合による多人数会話における話者決定技術,” 情報処理学会研究報告, 2008-SLP-74, pp.25-30, 2008.
大塚, 荒木, 石塚, 藤本, 大和, “多人数会話シーン分析に向けた実時間マルチモーダルシステムの構築～マルチモーダル全方位センサを用いた顔方向追跡と話者ダイアリゼーションの統合,” 電子情報通信学会マルチメディア・仮想環境基礎研究会 (MVE), 信学技報, vol. 108, no. 328, MVE2008-68, pp. 55-62, 2008.
渡部晋治, 中村篤, “巨視的な時間発展系に基づく逐次追従型音声認識,” 第１１回情報論的学習理論IBIS2008, A22.
藤本雅清, 石塚健太郎, 中谷智広, “音声区間検出と雑音抑圧の統合法を用いた雑音下音声認識,” 情報処理学会研究報告, 2008-SLP-74, pp.13-18, 2008.
藤本雅清, 石塚健太郎, 中谷智広, “確率モデルに基づく音声区間検出と雑音抑圧の統合の検討,” 日本音響学会講演論文集, 1-10-9, 春季, pp.27-30, 2008.
藤本雅清, 石塚健太郎, 中谷智広, “確率モデルに基づく統合的フロントエンド処理の検討,” 日本音響学会講演論文集, 1-1-5, 秋季, pp.11-14, 2008.
木下慶介, 中谷智広, 三好正人, “Upmixing stereo music signals based on dereverberation mechanism,” Audio Engineering Society (AES) Japan conference 2008
木下慶介, 中谷智広, 三好正人, “残響除去原理に基づくステレオ音楽信号のサラウンド化,” 日本音響学会秋季研究発表会, pp.615-618, 2008
J. Muramatsu, and S. Miyake, “Construction of wiretap channel codes by using sparse matrices,” 第6回シャノン理論ワークショップ予稿集, pp. 39-44, 2008.
村松純, 三宅茂樹, “疎行列アンサンブルのハッシュ性と多端子情報源符号(招待論文),” 科研費特定領域研究「情報統計力学の深化と展開」チュートリアル, ネットワーク情報理論：「センシングと符号化」予稿集, pp. 10-20, 2008.
J. Muramatsu, and S. Miyake, “Coding theorems based on the hash property,” 第31回情報理論とその応用シンポジウム予稿集, pp. 83-88, 2008.
J. Muramatsu, and S. Miyake, “Basic lemmas of a hash property,” 第31回情報理論とその応用シンポジウム予稿集, pp. 77-82, 2008.
村松純, 三宅茂樹, “疎行列の符号化問題への応用(招待論文),” 電子情報通信学会研究報告, vol. IT2008-24, pp. 25-30, 2008.
S. Miyake and J. Muramatsu, “A construction of channel code, joint source-channel code, and universal code for arbitrary stationary memoryless channels using sparse matrices,” 電子情報通信学会研究報告, vol. IT2007-54, pp. 37-42, 2008.
村松純, “相関乱数からの秘密鍵共有法について(招待論文),” 電子情報通信学会研究報告, vol. IT2007-32, pp. 39-44, 2008.

2007

論文

S. Araki, H. Sawada, R. Mukai and S. Makino, “Underdetermined Blind Sparse Source Separation for Arbitrarily Arranged Multiple Sensors,” Signal Processing, vol. 87, pp. 1833-1847, February 2007. doi:10.1016/j.sigpro.2007.02.003.
M. Knaak (Technical University Berlin), S. Araki and S. Makino, “Geometrically Constrained Independent Component Analysis,” IEEE Trans. Audio, Speech and Language Processing, vol. 15, No. 2, pp. 715-726, February, 2007.
T. Yamamoto, I. Oowada, H. Yip, A. Uchida, S. Yoshimori, K. Yoshimura, J. Muramatsu, S. Goto, and P. Davis, “Common-chaotic-signal induced synchronization in semiconductor lasers,” Opt. Express, vol.15, no.7, pp.3974-3980, April 2007.
H. Sawada, S. Araki, R. Mukai and S. Makino, “Grouping Separated Frequency Components with Estimating Propagation Model Parameters in Frequency-Domain Blind Source Separation,” IEEE Trans. Audio, Speech & Language Processing, vol. 15, no. 5, pp. 1592-1604, July 2007.
K. Ishizuka, R. Mugitani, H. Kato, and S. Amano, “Longitudinal developmental changes in spectral peaks of vowels produced by Japanese infants,” The Journal of the Acoustical Society of America, Vol.121, No.11, pp.2272-2282, 2007.
K. Kinoshita, T. Nakatani and M. Miyoshi, “Fast estimation of a precise dereverberation filter based on the harmonic structure of speech,” Acoustical Science and Technology (AST)
T. Yoshioka, T. Hikichi, and M. Miyoshi, “Dereverberation by using time-variant nature of speech production system,” EURASIP Journal on Advances in Signal Processing, vol. 2007, article ID 65698, doi:10.1155/2007/65698, 2007.
ソルヴァン加藤比呂子, 石塚健太郎, 藤本雅清, “AR-GARCHモデルに基いた音声区間検出法の提案,” 電子情報通信学会論文誌, Vol.J90-D, No.12, pp.3210-3220, 2007.
T. Hori, C. Hori, Y. Minami, and A. Nakamura, “Efficient WFST-based one-pass decoding with on-the-fly hypothesis rescoring in extremely large vocabulary continuous speech recognition,” IEEE Trans. Audio, Speech and Language Processing, Vol. 15, pp. 1352-1365, 2007.

書籍、解説記事

H. Sawada, S. Araki, and S. Makino, “Frequency-Domain Blind Source Separation,” in Blind Speech Separation, S. Makino, T.-W. Lee, and H. Sawada, Eds., Springer, 2007.
S. Araki, H. Sawada and S. Makino, “K-means based Underdetermined Blind Speech Separation,” in Blind Speech Separation, S. Makino, T.-W. Lee, and H. Sawada, Eds., Springer, 2007.

国際会議予稿

T. Nakatani, B.-H. Juang, T. Hikichi, T. Yoshioka, K. Kinoshita, M. Delcroix, and M. Miyoshi, “Study on speech dereverberation with autocorrelation codebook,” in Proceedings of the 2007 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2007), vol. 1, pp. 193-196, April 2007.
M. Fujimoto, K. Ishizuka, and H. Kato, “Noise Robust Voice Activity Detection Based on Statistical Model and Parallel Non-linear Kalman filtering,” Proc. ICASSP '07, Vol. IV, pp. 797-800, April 2007.
S. Araki, H. Sawada, and S. Makino, “Blind speech separation in a meeting situation with maximum SNR beamformers,” ICASSP2007, vol. 1, pp. 41-44, April 2007.
J. Cermak, S. Araki, H. Sawada and S. Makino, “Blind Source Separation Based on Beamformer Array and Time Frequency Binary Masking,” in Proc. ICASSP2007, vol. I, pp. 145-148, April 2007.
J. E. Rubio, K. Ishizuka, H. Sawada, S. Araki, T. Nakatani and M. Fujimoto, “Two-Microphone Voice Activity Detection Based on the Homogeneity of the Direction of Arrival Estimates,” in Proc. ICASSP2007, vol.4, pp. 385-388, April 2007.
T. Nakatani, T. Hikichi, K. Kinoshita, T. Yoshioka, M. Delcroix, M. Miyoshi, and Biing-Hwang Juang, “Robust blind dereverberation of speech signals based on characteristics of short-time speech segments,” in Proceedings of the 2007 IEEE International Symposium on Circuits and Systems (ISCAS 2007), pp. 2986-2989, May 2007.
H. Sawada, S. Araki and S. Makino, “Measuring Dependence of Bin-wise Separated Signals for Permutation Alignment in Frequency-domain BSS,” in Proc. ISCAS2007, pp. 3247-3250, May 2007.
M. Fujimoto and K. Ishizuka, “Noise Robust Voice Activity Detection Based on Switching Kalman Filtering,” Proc. Eurospeech '07, pp. 2933-2936, August 2007.
T. Oba, T. Hori, and A. Nakamura, “A Study of Efficient Discriminative Word Sequences for Reranking of Recognition Results based on N-gram Counts,” Interspeech2007, pp. 1753-1756, August 2007.
T. Yoshioka, T. Nakatani, T. Hikichi, and M. Miyoshi, “Overfitting-resistant speech dereverberation,” in Proceedings of the 2007 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA 2007), pp. 163-166, October 2007.
T. Nakatani, B.-H. Juang, T. Yoshioka, K. Kinoshita, and M. Miyoshi, “Importance of energy and spectral features in Gaussian source model for speech dereverberation,” in Proceedings of the 2007 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA 2007), pp. 299-302, October 2007.
I. Oowada, T. Yamamoto, H. Yip, H. Arizumi, A. Uchida, S. Yoshimori, K. Yoshimura, J. Muramatsu, S. Goto, and P. Davis, “Synchronization in semiconductor lasers subject to a common chaotic drive signal,” Proceedings of the 15th IEEE International Workshop on Nonlinear Dynamics of Electronic Systems, Tokushima, Japan, pp.149-152, 2007.
H. Sawada, S. Araki, and S. Makino, “A two-stage frequency-domain blind source separation method for underdetermined convolutive mixtures,” WASPAA2007.
H. Sawada, S. Araki, and S. Makino, “MLSP 2007 data analysis competition: Frequency-domain blind source separation for convolutive mixtures of speech/and audio,” MLSP2007, 2007.
J. E. Rubio, K. Ishizuka, H. Sawada, S. Araki, T. Nakatani, and M. Fujimoto, “Two-microphone voice activity detection based on the homogeneity of the direction of arrival estimate,” Proceedings of the 32nd International Conference on Acoustics, Speech, and Signal Processing (ICASSP2007), Vol.4, pp.385-388, 2007.
J. Muramatsu, “Effect of random permutation of symbols in a sequence,” Proceedings of the 2007 IEEE International Symposium on Information Theory, Nice, France, pp.1486-1490, 2007.
K. Ishizuka, T. Nakatani, M. Fujimoto, and N. Miyazaki, “Noise robust front-end processing with voice activity detection based on periodic to aperiodic component ratio,” Proceedings of the 10th European Conference on Speech Communication and Technology (Interspeech2007 - Eurospeech), pp.230-233, 2007.
M. Fujimoto and K. Ishizuka, “Noise robust voice activity detection based on switching Kalman filter,” Proceedings of the 10th European Conference on Speech Communication and Technology (Interspeech2007 - Eurospeech), pp.2933-2936, 2007.
M. Fujimoto, K. Ishizuka, and H. Kato, “Noise robust voice activity detection based on statistical model and parallel non-linear Kalman filtering,” Proceedings of the 32nd International Conference on Acoustics, Speech, and Signal Processing (ICASSP2007), Vol.4, pp.797-800, 2007.
R. Mugitani, T. Kobayashi, and K. Ishizuka, “Perceptual development of phonemic categories for Japanese single/geminate obstruents,” The 32nd Boston University Conference on Language Development (BUCLD32), 2007.
S. Miyake, and J. Muramatsu, “Constructions of a lossy source code using LDPC matrices,” Proceedings of the 2007 IEEE International Symposium on Information Theory, Nice, France, pp.1106-1110, 2007.
S. Watanabe and A. Nakamura, “Incremental adaptation based on a macroscopic time evolution system,” Proc. ICASSP 2007, vol. 4, pp. 769-772, 2007.
Y. Minami, M. Sawaki, K. Dohsaka, R. Higashinaka, K. Ishizuka, H. Isozaki, T. Matsubayashi, M. Miyoshi, A. Nakamura, T. Oba, H. Sawada, T. Yamada, and E. Maeda, “The world of Mushrooms: Human-computer interaction prototype systems for ambient intelligence,” Proceedings of the 9th International Conference on Multimodal Interfaces (ICMI2007), pp.366-373, 2007.

その他会議予稿

吉岡　拓也, 引地　孝文, 三好　正人, “音声の非定常性を利用した残響除去方法とその閉形式の近似解,” 日本音響学会 2007年春季研究発表会講演論文集, 2-1-7, pp. 559-560, March 2007.
中谷　智広, ジュアング　ビン・ファン, 引地　孝文, 吉岡　拓也, 木下　慶介, デルクロア　マーク, 三好　正人, “自己相関コードブックに基づく音声信号の残響除去,” 日本音響学会 2007年春季研究発表会講演論文集, 2-1-8, pp. 561-562, March 2007.
藤本雅清, 石塚健太郎, 加藤比呂子, “確率モデルに基づく音声区間検出法のCENSREC-1-Cによる評価,” 日本音響学会, 平成19年度春季研究発表会, 3-9-12, pp. 87-88, March 2007.
荒木, 澤田, 牧野, “話者分類とSN比最大化ビームフォーマに基づく会議音声強調,” 日本音響学会2007年春季研究発表会, pp. 571-572, March 2007.
澤田, 荒木, 大塚, 藤本, 石塚, “多人数多マイクでの発話区間検出～ピンマイクでの事例～,” 日本音響学会2007年春季研究発表会, pp. 679-680, March 2007.
渡部晋治, 中村篤, “線形回帰行列のコース／ファイン学習による音響モデル適応,” 音響学会講演論文集, 3-10-2, March 2007.
大庭隆伸, 堀貴明, 中村篤, “認識誤りに対する各単語N-gramの関与度を考慮した誤り訂正学習,” 音響学会講演論文集, March 2007.
大庭隆伸, 渡部晋治, 石塚健太郎, 藤本雅清, 堀貴明, Erik McDermott, 南泰浩, 中村篤, “音声認識システムSOLONにおける日本語講演音声への教師なし適応に関する評価,” 音響学会講演論文集, March 2007.
藤本雅清, 石塚健太郎, 中谷智広, “音声の周期性・非周期性成分比とSwitching Kalman filterに基づく雑音下音声区間検出,” 情報処理学会研究報告, SLP-67-13, pp. 69-74, July 2007.
吉岡　拓也, 三好　正人, “変分ベイズ法による音声強調と雑音スペクトルの適応的推定,” 日本音響学会 2007年秋季研究発表会講演論文集, 3-7-6, pp. 727-728, September 2007.
中谷　智広, ジュアング　ビン・ファン, 吉岡　拓也, 木下　慶介, 三好　正人, “音声信号の残響除去におけるエネルギーおよびスペクトル特徴の重要性,” 日本音響学会 2007年秋季研究発表会講演論文集, 3-P-10, pp. 761-762, September 2007.
藤本雅清, 石塚健太郎, 中谷智広, “複数の音声特徴量及び信号識別処理の適応的統合に基づく音声区間検出,” 日本音響学会, 平成19年度秋季研究発表会, 3-3-11, pp. 163-166, September 2007.
石塚, J.E.Rubio, 澤田, 荒木, 中谷, 藤本, “信号到来方向推定の偏在性を用いた耐雑音音声区間検出法,” 日本音響学会2007年秋季研究発表会, pp. 163-166, September 2007.
木下, 中谷, 澤田, 荒木, 三好, “複数音源が存在する残響環境でのマルチステップ線形予測の効果,” 日本音響学会2007年秋季研究発表会, September 2007.
渡部晋治, 中村篤, “巨視的な時間発展系に基づくモデル適応と従来型適応との関係の考察,” 音響学会講演論文集, 2-3-12, September 2007.
堀貴明, リーハセリントン, ティモシーヘイゼン, ジェームズグラス, “コンフュージョンネットワークを用いたオープン語彙発話検索,” 日本音響学会講演論文集, 1-3-10, September 2007.
堀貴明, 中村篤, “超大語彙音声認識による音声からのオンライン固有表現抽出,” 日本音響学会講演論文集, 2-3-2, September 2007.
大庭隆伸, 堀貴明, 中村篤, “誤り訂正モデルにおける単語誤り率基準での対立仮説選択とその効果,” 音響学会講演論文集, pp. 121-122, September 2007.
堀貴明, リーハセリントン, ティモシーヘイゼン, ジェームズグラス, “コンフュージョンネットワークを用いたオープン語彙発話検索法とその評価,” 信学技報 SP11-8, November 2007.
藤本雅清, 石塚健太郎, 中谷智広, “複数の音声区間検出法の適応的統合の検討と考察,” 電子情報通信学会, 音声研究会, SP2007-97, pp. 7-12, December 2007.
石塚, 荒木, 藤本, 瀬戸口(京大), 高梨(京大), 河原(京大), “ポスター会話に対する発話区間検出と話者識別の検討,” 情報処理学会研究報告, pp. 217-222, December 2007.
M. Delcroix, T. Nakatani, and S. Watanabe, “Dynamic feature variance adaptation for robust speech recognition with a speech enhancement pre-processor,” 電子情報通信学会技術研究報告2007-SP-105, pp.55-60, December 2007.
渡部晋治, 中村篤, “巨視的な時間発展系に基づく逐次モデル適応－モデルの逐次更新における学習データの発話数に関する考察－,” 電子情報通信学会研究報告2007-SP-130, pp.201-206, December 2007.
大庭隆伸, 堀貴明, 中村篤, “識別的誤り訂正学習における対立単語列と素性の選定,” 電子情報通信学会研究技術報告, 2007-SLP-69, pp. 235-240, December 2007.
K. Kinoshita, M. Delcroix, T. Nakatani and M. Miyoshi, “Dereverberation of real recordings using linear prediction-based microphone array,” Audio Engineering Society (AES) 13th Regional Convention, Tokyo, 2007.
K. Kinoshita, M. Delcroix, T. Nakatani and M. Miyoshi, “Multi-step linear prediction based speech enhancement in noisy reverberant environment,” Proc. of Interspeech, pp.854-857, 2007.
石塚健太郎, Juan Emilio Rubio, 澤田宏, 荒木章子, 中谷智広, 藤本雅清, “信号到来方向の推定値の偏りを用いた耐雑音音声区間検出法,” 日本音響学会講演論文集, 3-3-10, 秋季, pp.161-162, 2007.
石塚健太郎, 荒木章子, 藤本雅清, 瀬戸口久雄, 高梨克也, 河原達也, “ポスター会話に対する発話区間検出と話者識別の検討,” 電子情報通信学会技術研究報告NLC2007-70, SP2007-133, pp.217-222, 2007.
大庭隆伸, 渡部晋治, 石塚健太郎, 藤本雅清, 堀貴明, マクダーモットエリック, 南泰浩, 中村篤, “音声認識システムSOLONにおける日本語講演音声への教師なし適応に関する評価,” 日本音響学会講演論文集, 1-9-11, 春季, 2007.
藤本雅清, 石塚健太郎, 加藤比呂子, “確率モデルに基づく音声区間検出法のCENSREC-1-Cによる評価,” 日本音響学会講演論文集, 3-9-12, 春季, 2007.
藤本雅清, 石塚健太郎, 中谷智広, “複数の音声区間検出法の適応的統合の検討と考察,” 電子情報通信学会技術研究報告NLC2007-34, SP2007-97, pp.7-12, 2007.
藤本雅清, 石塚健太郎, 中谷智広, “複数の音声特徴量及び信号識別処理の適応的統合に基づく音声区間検出,” 日本音響学会講演論文集, 3-3-11, 秋季, pp.163-166, 2007.
麦谷綾子, 小林哲生, 石塚健太郎, 天野成昭, “幻の「っ」：日本語促音の知覚発達過程,” 日本音響学会聴覚研究会資料H2007-116, pp.667-671, 2007.
木下慶介, 中谷智広, 澤田宏, 荒木章子, 三好正人, “複数音源が存在する残響環境でのマルチステップ線形予測の効果,” 日本音響学会秋季研究発表会, pp.731-732, 2007.
澤田宏, 荒木章子, 大塚和弘, 藤本雅清, 石塚健太郎, “多人数マイクでの発話区間検出 - ピンマイクでの事例 -,” 日本音響学会講演論文集, 3-Q-15, 春季, 2007.
三宅茂樹, 村松純, “LDPC 行列を用いた通信路符号の構成,” 第5回シャノン理論ワークショップ予稿集, pp. 1-7, 2007.
大和田功, 山本徹, 葉海鵬, 有泉宏紀, 内田淳史, 吉森茂, 吉村和之, 村松純, 後藤振一郎, Peter Davis, “共通カオス信号により駆動された半導体レーザーカオス同期実験,” 電子情報通信学会研究報告, 非線形問題研究会, 2007.
大和田功, 山本徹, 葉海鵬, 内田淳史, 吉森茂, 吉村和之, 村松純, 後藤振一郎, Peter Davis, “半導体レーザーにおける共通カオス信号による同期の数値解析,” 応用物理学会予稿集, 講演番号 30p-J-5, 2007.
山本徹, 大和田功, 葉海鵬, 内田淳史, 吉森茂, 吉村和之, 村松純, 後藤振一郎, Peter Davis, “半導体レーザーにおける共通カオス信号による同期の同期実験,” 応用物理学会予稿集, 講演番号 30p-J-5, 2007.
大和田功, 山本徹, 葉海鵬, 内田淳史, 吉森茂, 吉村和之, 村松純, 後藤振一郎, Peter Davis, “半導体レーザーにおける共通カオス信号による同期の数値解析,” 電子情報通信学会全国大会予稿集, p. 55, 2007.
山本徹, 大和田功, 葉海鵬, 内田淳史, 吉森茂, 吉村和之, 村松純, 後藤振一郎, Peter Davis, “半導体レーザーにおける共通カオス信号による同期の同期実験,” 電子情報通信学会全国大会予稿集, p. 51, 2007.
吉村和之, 村松純, 後藤振一郎, Peter Davis, “共通ノイズ入力による同期現象,” 電子情報通信学会全国大会予稿集, S39-S40, 2007.
三宅茂樹, 村松純, “LDPC行列を用いた有歪み情報源符号の構成,” 電子情報通信学会研究報告, vol. IT2006-51, pp. 7-11, 2007.
村松純, 三宅茂樹, “無歪み多端子情報源符号器を利用した有歪み情報源符号化アルゴリズム,” 電子情報通信学会研究報告, vol.IT2006-50, pp. 1-6, 2007.

2006

論文

T. Yoshioka, T. Hikichi, M. Miyoshi, and H. G. Okuno, “Common acoustical pole estimation from multi-channel musical audio signals,” IEICE Transactions on Fundamentals, vol. E89-A, no. 1, pp. 240-247, January 2006.
J. Muramatsu, “Secret key agreement from correlated source outputs using low density parity check matrices,” IEICE Transactions on Fundamentals, vol.E89-A, no.7, pp.2036-2046, July 2006.
J. Muramatsu, K. Yoshimura, and P. Davis, “Secret key capacity and advantage distillation capacity,” IEICE Transactions on Fundamentals, vol.E89-A, no.10, pp.2589-2596, October 2006.
J. Muramatsu, K. Yoshimura, K. Arai, and P. Davis, “Secret key capacity for optimally correlated sources under sampling attack,” IEEE Transactions on Information Theory, vol.IT-52, no.11, pp.5140-5151, November 2006.
H. Sawada, S. Araki, R. Mukai, S. Makino, “Blind extraction of dominant target sources using ICA and time-frequency masking,” IEEE Trans. Audio, Speech, and Language Processing, vol.14, no.6, pp.2165-2173, November 2006.
K. Ishizuka and T. Nakatani, “A feature extraction method using subband based periodicity and aperiodicity decomposition with noise robust frontend processing for automatic speech recognition,” Speech Communication, Vol.48, No.11, pp.1447-1457, 2006.
K. Ishizuka, T. Nakatani, Y. Minami, and N. Miyazaki, “Speech feature extraction method using subband-based periodicity and nonperiodicity decomposition,” The Journal of the Acoustical Society of America, Vol.120, No.1, pp.443-452, 2006.
R. Mukai, H. Sawada, S. Araki, S. Makino, “Frequency Domain Blind Source Separation of Many Speech Signals Using Near-field and Far-field Models,” EURASIP Journal on Applied Signal Processing, vol. 2006, Article ID 83683, 13 pages, 2006. doi:10.1155/ASP/2006/83683.
S. Watanabe and A. Nakamura, “Speech recognition based on Student's t-distribution derived from total Bayesian framework,” IEICE D-II, vol. E89-D, no. 3, pp. 970-980, 2006
S. Watanabe, A. Sako, and A. Nakamura, “Automatic Determination of Acoustic Model Topology using Variational Bayesian Estimation and Clustering for large vocabulary continuous speech recognition,” IEEE Transactions on Speech and Audio Processing, vol. 14, issue 3, pp. 855-872, 2006.
麦谷綾子, 小林哲生, 石塚健太郎, 天野成昭, 開一夫, “日本人乳児の音声口形マッチングの発達に関する母音/i/を用いた検討,” 音声研究, Vol.10, No.1, pp.96-108, 2006.

書籍、解説記事

A. Nakamura, S. Watanabe, T. Hori, E. McDermott, and S. Katagiri, “Advanced Computational Models and Learning Theories for Spoken Language Processing,” IEEE Computational Intelligence Magazine, vol. 1, issue 2, pp. 5-9, May 2006.
S. Makino, H. Sawada, R. Mukai, and S. Araki, “Blind source separation of convolutive mixtures of audio signals in frequency domain,” in Topics in Acoustic Echo and Noise Control, E. Haensler and G. Schmidt, Eds., Springer, 2006.
渡部晋治, “ベイズ法による音声認識,” 日本音響学会誌62巻8号, pp. 599-604, 2006.

国際会議予稿

T. Yoshioka, T. Hikichi, and M. Miyoshi, “Second-order statistics based dereverberation by using nonstationarity of speech,” in Proceedings of the 2006 International Workshop on Acoustic Echo and Noise Control (IWAENC 2006), CD-ROM Proceedings, September 2006.
T. Yoshioka, T. Hikichi, M. Miyoshi, and H. G. Okuno, “Robust decomposition of inverse filter of channel and prediction error filter of speech signal for dereverberation,” in Proceedings of the 2006 European Signal Processing Conference (EUSIPCO 2006), CD-ROM Proceedings, September 2006.
T. Oba, T. Hori, and A. Nakamura, “Sentence Boundary Detection Using Sequential Dependency Analysis Combined with CRF-based Chunking,” ICSLP2006, pp. 284-289, September 2006.
H. Sawada, S. Araki, R. Mukai and S. Makino, “Blind separation and localization of speeches in a meeting situation,” Asilomar 2006, pp. 1407-1411, October 2006.
R. Mukai, H. Sawada, S. Araki and S. Makino, “Frequency Domain Blind Source Separation in a Noisy Environment,” Joint meeting of ASA and ASJ 2006, November 2006, (invited).
H. Kato, Y. Nagahara, S. Araki, H. Sawada and S. Makino, “Parametric Pearson Approach based Independent Component Analysis for Frequency Domain Blind Speech Separation,” EUSIPCO2006, 2006.
H. Sawada, S. Araki, R. Mukai and S. Makino, “On Calculating the Inverse of Separation Matrix in Frequency-Domain BSS,” ICA2006, pp. 691-699, 2006.
H. Sawada, S. Araki, R. Mukai and S. Makino, “Solving the permutation problem of frequency-domain BSS when spatial aliasing occurs with wide sensor spacing,” ICASSP2006, vol. 5, pp. 77-80, 2006.
J. Cermak, S. Araki, H. Sawada and S. Makino, “Blind Speech Separation by Combining Beamformers and a Time Frequency Binary Mask,” IWAENC2006, 2006.
J. Cermak, S. Araki, H. Sawada and S. Makino, “Musical Noise Reduction in Time-frequency-binary-masking-based Blind Source Separation Systems,” 16th Czech-German Workshop, 2006.
J. Muramatsu, K. Yoshimura, and P. Davis, “Secret key capacity and advantage distillation capacity,” Proceedings of the 2006 IEEE International Symposium on Information Theory, pp.2147-2151, 2006.
J. Muramatsu, K. Yoshimura, K. Arai, and P. Davis, “Some results on secret key agreement from correlated sources,” Proceedings of the 5th Asian-European Workshop on Information Theory, Jeju, Korea, pp.10-13, 2006.
K. Ishizuka and H. Kato, “A feature for voice activity detection derived from speech analysis with the exponential autoregressive model,” Proceedings of the 31st International Conference on Acoustics, Speech, and Signal Processing (ICASSP2006), Vol.1, pp.789-792, 2006.
K. Ishizuka and T. Nakatani, “Study of noise robust voice activity detection based on periodic component to aperiodic component ratio,” Proceedings of ISCA Tutorial and Research Workshop on Statistical And Perceptual Audition (SAPA2006), pp.65-70, 2006.
K. Yoshimura, J. Muramatsu, and P. Davis, “Conditions for consistency in time-delay systems,” Proceedings of the International Workshop on Synchronization Phenomena and Analyses, p.135, 2006.
K. Yoshimura, J. Muramatsu, and P. Davis, “Consistency in time-delay systems with periodic feedback functions,” Proceedings of the 2006 International Symposium on Nonlinear Theory and its Applications, pp.287-290, 2006.
R. Mukai, H. Sawada, S. Araki, S. Makino, “Blind Source Separation of Many Signals in the Frequency Domain,” ICASSP2006, vol.5, pp.969-972, 2006.
S. Araki, H. Sawada, R. Mukai and S. Makino, “Blind sparse source separation with spatially smoothed time-frequency masking,” IWAENC2006, 2006.
S. Araki, H. Sawada, R. Mukai and S. Makino, “Performance evaluation of sparse source separation and DOA estimation with observation vector clustering in reverberant environments,” IWAENC2006, 2006.
S. Araki, H. Sawada, R. Mukai and S. Makino, “DOA estimation for multiple sparse sources with normalized observation vector clustering,” ICASSP2006, vol. 5, pp. 33-36, 2006.
S. Araki, H. Sawada, R. Mukai and S. Makino, “Normalized Observation Vector Clustering Approach for Sparse Source Separation,” EUSIPCO2006, (invited).
S. Araki, H. Sawada, R. Mukai and S. Makino, “Underdetermined Sparse Source Separation of Convolutive Mixtures with Observation Vector Clustering,” ISCAS2006, pp. 3594-3597, 2006.
S. Mizutani, J. Muramatsu, K. Arai, and P. Davis, “Noise-assisted quantization,” Proceedings of the 2006 International Symposium on Nonlinear Theory and its Applications, pp.843-846, 2006.
S. Watanabe and A. Nakamura, “Acoustic model adaptation based on coarse/fine training of transfer vector using directional statistics,” Proc. ICASSP 2006, vol. 1, pp. 1005-1008, 2006.
T. Hori and A. Nakamura, “An extremely large vocabulary approach to named entity extraction from speech,” in Proc. ICASSP2006, Vol. 1, pp. 973-976, 2006.
T. Hori, I. L. Hetherington, T. J. Hazen, and J. R. Glass, “Open-vocabulary spoken utterance retrieval using confusion networks,” in Proc. ICASSP2007, Vol. 1, pp. 973-976, 2006.

その他会議予稿

渡部晋治, 中村篤, “方向統計を用いた移動ベクトルのコース／ファイン学習に基づく音響モデル適応,” 音響学会講演論文集, 1-11-24, March 2006.
中村篤, 大庭隆伸, 渡部晋治, 石塚健太郎, 堀貴明, Mike Schuster, Erik McDermott, 南泰浩, “音声認識システムSOLONの日本語話し言葉コーパス(公開版Ver.1.0)による評価,” 音響学会講演論文集, 2-8-7, March 2006.
藤本雅清, 石塚健太郎, 加藤比呂子, “音声／非音声状態遷移モデルに基づく音声区間検出,” 日本音響学会, 平成18年度秋季研究発表会, 1-2-17, pp. 33-34, September 2006.
渡部晋治, 中村篤, “確率分布の巨視的な時間発展システムに基づく逐次モデル適応,” 音響学会講演論文集, 2-2-10, September 2006.
大庭隆伸, 堀貴明, 中村篤, “チャンキングと逐次的係り受け解析に基づく話し言葉の文境界検出,” 日本音響学会講演論文集, 2-2-5, September 2006.
吉岡　拓也, 引地　孝文, 三好　正人, “音声生成系と室内伝達系逆フィルタの同時推定について,” 日本音響学会関西支部第9回若手研究者交流研究発表会, 口頭発表, December 2006.
藤本雅清, 石塚健太郎, 加藤比呂子, “音声と雑音両方の状態遷移過程を有する雑音下音声区間検出,” 電子情報通信学会, 音声研究会, SP2006-87, pp. 13-18, December 2006.
中村篤, 大庭隆伸, 渡部晋治, 石塚健太郎, 藤本雅清, 堀貴明, Erik McDermott, 南泰浩, “音声認識システムSOLONの日本語話し言葉コーパスによる評価（2006年版）,” 情報処理学会研究報告2006-SLP-64, pp.251-256, December 2006.
西村竜一, 秋田祐哉, 須藤克仁, 大庭隆伸, “（サーベイ）ICSLPにおける研究動向 -言語モデル・対話システムを中心に-,” 電子情報通信学会研究技術報告, 2006-SLP-64, pp. 239-244, December 2006.　（外部からの依頼）
K. Kinoshita, T. Nakatani and M. Miyoshi, “Spectral subtraction steered by multi-step forward linear prediction for single channel speech dereverberation,” Proc. of International Conference on Acoustics, Speech, and Signal Processing (ICASSP), I, pp.817-820
加藤, 永原(明治大), 荒木, 澤田, 牧野, “パラメトリックピアソン分布を用いた周波数領域ブラインド音源分離,” 日本音響学会2006年春季研究発表会, pp. 549-550, 2006.
加藤比呂子, 石塚健太郎, “GARCHモデルを用いた音声区間検出手法の提案,” 日本音響学会講演論文集, 2-1-19, pp.107-108, 春季, 2006.
吉村和之, 村松純, “相関乱数を利用した暗号(招待論文),” 第67回応用物理学会学術講演会講演予稿集, p.51, 2006.
荒木, 澤田, 向井, 牧野, “観測信号ベクトルのクラスタリングに基づくスパース信号の到来方向推定,” 日本音響学会2006年春季研究発表会, pp. 615-616, 2006.
石塚健太郎, 加藤比呂子, “Exponential自己回帰モデルを用いた音声区間検出,” 日本音響学会講演論文集, 2-1-20, pp.109-110, 春季, 2006.
石塚健太郎, 中谷智広, “信号の周期性・非周期性成分の比を用いた耐雑音音声区間検出の評価,” 日本音響学会講演論文集, 3-9-11, 春季, 2006.
石塚健太郎, 中谷智広, “信号の周期性成分・非周期性成分の比を用いた耐雑音音声区間検出,” 日本音響学会講演論文集, 1-2-18, pp.35-36, 秋季, 2006.
村松純, “系列順序のランダムな置換についての再考,” 第29回情報理論とその応用シンポジウム予稿集, pp. 263-266, 2006.
村松純, “相関情報を利用した秘密情報共有法に対する符号化定理(招待論文),” 2006年電子情報通信学会基礎・境界ソサイエティ大会講演論文集, AT-1-3, 2006.
中村亜希子, 本間章浩, 相川清明, 石塚健太郎, “雑音中の周波数変化音の検知,” 日本音響学会講演論文集, 2-3-3, pp.443-444, 春季, 2006.
中村篤, 大庭隆伸, 渡部晋治, 石塚健太郎, 藤本雅清, 堀貴明, エリック・マクダーモット, 南泰浩, “音声認識システムSOLONの日本語話し言葉コーパスによる評価（2006年版）,” 情報処理学会研究報告2006-SLP-64, pp.251-256, 2006.
中村篤, 大庭隆伸, 渡部晋治, 石塚健太郎, 堀貴明, シュスターマイク, マクダーモットエリック, 南泰浩, “音声認識システムSOLONの日本語話し言葉コーパス（公開版Ver.1.0）による評価,” 日本音響学会講演論文集, 3-1-1, pp.1185-1186, 春季, 2006.
藤本雅清, 石塚健太郎, 加藤比呂子, “音声／非音声状態遷移モデルに基づく音声区間検出,” 日本音響学会講演論文集, 1-2-17, pp.33-34, 秋季, 2006.
藤本雅清, 石塚健太郎, 加藤比呂子, “音声と雑音両方の状態遷移過程を有する雑音下音声区間検出,” 情報処理学会研究報告2006-SLP-64, pp.13-18, 2006.
藤本雅清, 石塚健太郎, 中谷智広, “音声の周期性・非周期性成分比とSwitching Kalman filterに基づく雑音下音声区間検出,” 情報処理学会研究報告2007-SLP-67, pp.69-74, 2006.
木下慶介, デルクロア・マーク, 中谷智広, 三好正人, “実音場収音した音声による「マルチステップ線形予測に基づく残響除去方法」の評価,” 日本音響学会秋季研究発表会, pp.421-422, 2006.
木下慶介, 中谷智広, 三好正人, “マルチステップ線形予測を用いた１ｃｈ残響除去法の検討,” 日本音響学会春季研究発表会, pp.511-512, 2006.

2005

論文

A. Blin, S. Araki, and S. Makino, “Underdetermined blind separation of convolutive mixtures of speech using time-frequency mask and mixing matrix estimation,” IEICE Trans. Fundamentals, Vol.E88-A, No.7, pp.1693-1700, 2005.
H. Sawada, R. Mukai, S. Araki, and S. Makino, “Estimating the number of sources using independent component analysis,” Acoustical Science and Technology, vol. 26, no. 5, pp.450-452, 2005.
K. Kinoshita, T. Nakatani and M. Miyoshi, “Harmonicity based dereverberation for improving automatic speech recognition performance and speech intelligibility,” IEICE, 2005.
S. Araki, S. Makino, R. Aichner(Univ. Erlangen-Nuremberg), T. Nishikawa(NAIST) and H. Saruwatari(NAIST), “Subband-based Blind Separation for Convolutive Mixtures of Speech,” IEICE Trans. Fundamentals, E88-A(12), pp. 3593-3603, 2005.
S. Makino, H. Sawada, R. Mukai, and S. Araki, “Blind source separation of convolutive mixtures of speech in frequency domain,” IEICE Trans. Fundamentals, Vol.E88-A, No.7, pp.1640-1655, 2005. (invited)

書籍、解説記事

S. Araki, S. Makino, “Subband Based Blind Source Separation,” In J. Benesty, S. Makino, and J. Chen, editors, Speech Enhancement, pp. 329-352, Springer, March 2005.
H. Sawada, R. Mukai, S. Araki and S. Makino, “Frequency-domain blind source separation,” In J. Benesty, S. Makino, and J. Chen, editors, Speech Enhancement, pp.299-327, Springer, March 2005.
R. Mukai, H. Sawada, S. Araki and S. Makino, “Real-time blind source separation for moving speech signals,” In J. Benesty, S. Makino, and J. Chen, editors, Speech Enhancement, pp.353-369, Springer, March 2005.

国際会議予稿

S. Araki, S. Makino, H. Sawada and R. Mukai, “Reducing musical noise by a fine-shift overlap-add method applied to source separation using a time-frequency mask,” ICASSP2005, vol. III, pp. 81-84, March 2005.
S. Araki, S. Makino, H. Sawada, and R. Mukai, “Source extraction from speech mixtures with null-directivity pattern based mask,” Proc. of Joint Workshop on Hands-Free Speech Communication and Microphone Arrays (HSCMA 2005), pp. d1-d2, March 2005.
H. Sawada, S. Araki, R. Mukai, S. Makino, “Blind Extraction of a Dominant Source Signal from Mixtures of Many Sources,” ICASSP2005, vol. III, pp. 61-64, March 2005.
H. Sawada, R. Mukai, S. Araki, and S. Makino, “Frequency-domain blind source separation without array geometry information,” Proc. of Joint Workshop on Hands-Free Speech Communication and Microphone Arrays (HSCMA 2005), pp.d13-d14, March 2005.
R. Mukai, H. Sawada, S. Araki, and S. Makino, “Blind source separation and DOA estimation using small 3-D microphone array,” Proc. of Joint Workshop on Hands-Free Speech Communication and Microphone Arrays (HSCMA 2005), pp. d9-d10, March 2005.
M. Schuster and T. Hori, “Efficient generation of high-order context-dependent weighted finite state transducers for speech recognition,” in Proc. ICASSP2005, Vol I, pp. 201-204, March 2005.
T. Yoshioka, T. Hikichi, M. Miyoshi, and H. G. Okuno, “Blind estimation of room resonances using popular, classical, and jazz music,” in Proceedings of the 118th Audio Engineering Society Convention (AES 118), article ID 6632, May 2005.
H. Sawada, S. Araki, R. Mukai, and S. Makino, “Blind extraction of a dominant source from mixtures of many sources using ICA and time-frequency masking,” Proc. of 2005 IEEE International Symposium on Circuits and Systems (ISCAS 2005), pp. 5882-5885, May 2005.
H. Sawada, R. Mukai, S. Araki, and S. Makino, “Multiple source localization using independent component analysis,” Proc. of 2005 IEEE AP-S International Symposium and USNC/URSI National Radio Science Meeting, July 2005.
H. Kato, Y. Nagahara (Meiji Univ.), S. Araki, and H. Sawada, “Pearson distribution system applied to blind speech separation,” 25th European Meeting of Statisticians (EMS2005), p.394, July 2005.
T. Hori and A. Nakamura, “Generalized fast on-the-fly composition algorithm for WFST-based speech recognition,” in Proc. Interspeech2005-Eurospeech, pp. 557-560, September 2005.
M. Schuster, T. Hori, and A. Nakamura, “Experiments with Probabilistic Principal Component Analysis in LVCSR,” in Proc. Interspeech2005-Eurospeech, pp. 1685-1688, September 2005.
R. Mukai, H. Sawada, S. Araki, and S. Makino, “Blind Source Separation of 3-D Located Many Speech Signals,” in Proc. WASPAA2005, pp. 9-12, October 2005.
T. Oba, T. Hori, and A. Nakamura, “Dependency modeling for integrated spontaneous speech processing,” in Proc. ASRU2005, pp. 284-289, November 2005.
M. Schuster, and T. Hori, “Construction of weighted finite state transducers for very wide context-dependent acoustic models,” in Proc. ASRU2005, pp. 162-167, November 2005.
T. Oba, T. Hori, and A. Nakamura, “Sequential Dependency Analysis for Spontaneous Speech Understanding,” ASRU2005, pp. 284-289, November 2005.
F. Flego, S. Araki, H. Sawada, T. Nakatani, and S. Makino, “Underdetermined blind separation for speech in real environments with F0 adaptive comb filtering,” IWAENC2005, pp. 93-96, 2005.
H. Sawada, R. Mukai, S. Araki, and S. Makino, “Real-time blind extraction of dominant target sources from many background interferences,” IWAENC2005, pp. 73-76, 2005.
K. Ishizuka and T. Nakatani, “Robust speech feature extraction using subband based periodicity and aperiodicity decomposition in the frequency domain,” Proceedings of the Joint Workshop on Hands-Free Speech Communication and Microphone Arrays (HSCMA2005), pp.a13-a14, 2005.
K. Ishizuka, H. Kato, and T. Nakatani, “Speech signal analysis with exponential autoregressive model,” Proceedings of the 30th International Conference on Acoustics, Speech, and Signal Processing (ICASSP2005), Vol.1, pp.225-228, 2005.
K. Ishizuka, R. Mugitani, H. Kato, and S. Amano, “A longitudinal analysis of the spectral peaks of vowels for a Japanese infant,” Proceedings of the 9th European Conference on Speech Communication and Technology (Interspeech2005 - Eurospeech) pp.1169-1172, 2005.
R. Mugitani, K. Ishizuka, and S. Amano, “Longitudinal development of mora-timed rhythmic structure in Japanese,” The 30th Boston University Conference on Language Development BUCLD30, p.52, 2005.
R. Mukai, H. Sawada, S. Araki, and S. Makino, “Real-Time Blind Source Separation and DOA Estimation Using Small 3-D Microphone Array,” IWAENC2005, pp. 45-48, 2005.
S. Araki, H. Sawada, R. Mukai and S. Makino, “A novel blind source separation method with observation vector clustering,” IWAENC2005, pp.117-120, 2005.
S. Watanabe and A. Nakamura, “Effects of Bayesian predictive classification using variational Bayesian posteriors for sparse training data in speech recognition,” Proc. Interspeech '2005 Eurospeech, pp. 1105-1108, 2005.

その他会議予稿

吉岡　拓也, 引地　孝文, 三好　正人, 駒谷　和範, 尾形　哲也, 奥乃　博, “音楽音響信号による室内音場共振周波数のブラインド推定,” 日本音響学会 2005年春季研究発表会講演論文集, 3-6-9, pp. 501-502, March 2005.
渡部晋治, 堀貴明, “HMM状態－単語の同時確率を用いた音声言語処理のための複雑度指標,” 音響学会講演論文集, 1-5-23, pp. 45-46, March 2005.
Mike Schuster, 堀貴明, 中村篤, “Experiments with Probabilistic Principal Component Analysis on the CSJ database,” 日本音響学会講演論文集, 2-Q-1, March 2005.
渡部晋治, 中村篤, “Studentのt分布を用いたベイズ予測識別の音声認識における効果,” 音響学会講演論文集, 2-7-17, September 2005.
Mike Schuster, 堀貴明, “Construction of finite state transducers for very wide context-dependent acoustic model using the CSJ database,” 日本音響学会講演論文集, 2-1-6, September 2005.
堀貴明, 中村篤, “WFSTの拡張高速on-the-fly合成法による音声認識,” 日本音響学会講演論文集, 2-1-7, September 2005.
大庭隆伸, 堀貴明, 中村篤, “自然発話理解のための逐次的係り受け解析,” 音響学会講演論文集, September 2005.
渡部晋治, 南泰浩, 中村篤, 上田修功, “[依頼講演] 変分ベイズを用いた音声認識,” 第8回情報論的学習理論ワークショップ予稿集, pp. 269-274, November 2005.
中村篤, 大庭隆伸, 渡部晋治, 石塚健太郎, 堀貴明, Mike Schuster, Erik McDermott, 南泰浩, “音声認識システムSOLONの日本語話し言葉コーパス(公開版Ver.1.0)による評価,” 音声言語処理研究報告 No.59, pp. 97-102, December 2005.
石塚健太郎, 中谷智広, “音声特徴抽出法SPADEを用いたフロントエンドの耐雑音標準コーパスによる評価,” 電子情報通信学会研究技術報告SP2005-122/NLC2005-73, pp.71-72, December 2005.
中村篤, 大庭隆伸, 渡部晋治, 石塚健太郎, 堀貴明, Mike Schuster, Erik McDermott, 南泰浩, “音声認識システムＳＯＬＯＮの日本語話し言葉コーパス(公開版Ver1.0)による評価,” 第７回音声言語シンポジウム, December 2005.
大庭隆伸, 堀貴明, 中村篤, “自然発話音声のための逐次的係り受け解析,” 音響学会関西支部第８回若手研究者交流研究発表会, December 2005.
中村篤, 大庭隆伸, 渡部晋治, 石塚健太郎, 堀貴明, Mike Schuster, Erik McDermott, 南泰浩, “音声認識システムSOLONの日本語話し言葉コーパス（公開版Ver.1.0）による評価,” 電子情報通信学会研究技術報告SP2005-106/NLC2005-73, pp.7-12, December 2005.
K. Kinoshita, T. Nakatani and M. Miyoshi, “Fast estimation of a precise dereverberation filter based on speech harmonicity,” Proc. of International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2005.
K. Kinoshita, T. Nakatani and M. Miyoshi, “Efficient blind dereverberation framework for automatic speech recognition,” Proc. of Interspeech, 2005.
加藤, 永原(明治大), 荒木, 澤田, 牧野, “パラメトリックピアソン分布を用いた周波数領域ブラインド音源分離,” 日本音響学会2005年秋季研究発表会, pp. 593-594, 2005.
向井, 澤田, 荒木, 牧野, “3次元マイクロホンアレイを用いた多音源ブラインド分離,” 信学会ソサイエティ大会, p. 209, 2005.
荒木, 澤田, 向井, 牧野, “観測ベクトルのクラスタリングによるブラインド音源分離,” 信学会ソサイエティ大会, p. 208, 2005.
荒木, 澤田, 向井, 牧野, “観測信号ベクトル正規化とクラスタリングによる音源分離手法とその評価,” 日本音響学会2005年秋季研究発表会, pp. 591-592, 2005.
石塚健太郎, 加藤比呂子, 中谷智広, “Exponential自己回帰モデルを用いた音声信号分析方法,” 日本音響学会講演論文集, 2-1-16, pp.235-236, 春季, 2005.
石塚健太郎, 中谷智広, “音声特徴抽出法SPADEを用いた耐雑音フロントエンド,” 日本音響学会講演論文集, 2-8-3, pp.63-64, 秋季, 2005.
石塚健太郎, 麦谷綾子, 加藤比呂子, 天野成昭, “F1-F2平面上における成人母音からの距離に基づく幼児母音の月齢変化,” 日本音響学会講演論文集, 3-8-3, pp.449-450, 秋季, 2005.
石塚健太郎, 麦谷綾子, 天野成昭, “乳幼児の母音に対する周波数ピークの縦断的分析,” 日本音響学会講演論文集, 2-2-7, pp.335-336, 春季, 2005.
中村篤, 大庭隆伸, 渡部晋治, 石塚健太郎, 堀貴明, マイク・シュスター, エリック・マクダーモット, 南泰浩, “音声認識システムSOLONの日本語話し言葉コーパス（公開版Ver.1.0）による評価,” 電子情報通信学会研究技術報告SP2005-106/NLC2005-73, pp.7-12, 2005.
渡部晋治, 南泰浩, 中村篤, 上田修功, “[論文賞受賞記念講演] ベイズ的基準を用いた状態共有型HMM構造の選択,” 電子情報通信学会技術報告, SP2004-149, pp. 25-30, 2005.
木下慶介, 中谷智広, 三好正人, “音声のスパース性を用いる１チャネルブラインド残響除去,” 日本音響学会秋季研究発表会, 2005.
澤田, 荒木, 向井, 牧野, “多くの背景音からの主要音源のブラインド抽出,” 信学会ソサイエティ大会, p. 210, 2005.

2004

論文

R. Mukai, S. Araki, H. Sawada, S. Makino, “Evaluation of Separation and Dereverberation Performance in Frequency Domain Blind Source Separation,” Acoustical Science and Technology, Vol.25, No.2, pp.119-126, March 2004.
H. Sawada, R. Mukai, S. Araki, S. Makino, “Convolutive Blind Source Separation for more than Two Sources in the Frequency Domain,” Acoustical Science and Technology, the Acoustical Society of Japan, vol.25, no.4, pp. 296-298, July 2004.
R. Mukai, H. Sawada, S. Araki, S. Makino, “Blind Source Separation for Moving Speech Signals using Blockwise ICA and Residual Crosstalk Subtraction,” IEICE Trans. Fundamentals, Special Section on Digital Signal Processing, vol.E87-A, no.8, pp.1941-1948, August, 2004.
H. Sawada, R. Mukai, S. Araki, S. Makino, “A Robust and Precise Method for Solving the Permutation Problem of Frequency-Domain Blind Source Separation,” IEEE Trans. Speech and Audio Processing, vol.12, no.5, pp.530-538, September 2004.
S. Watanabe and A. Nakamura, “Acoustic model adaptation based on coarse/fine training of transfer vectors,” (in Japanese), 情報科学技術レターズ.
S. Watanabe, Y. Minami, A. Nakamura and N. Ueda, “Variational Bayesian Estimation and Clustering for Speech Recognition,” IEEE Transactions on Speech and Audio Processing, vol. 12, pp. 365-381, 2004.

書籍、解説記事

牧野昭二, 荒木章子, 向井良, 澤田宏, “畳込み混合のブラインド音源分離,” システム/制御/情報, vol.48, no.10, pp.401-408, 2004.
堀貴明, 塚田元, “音声情報処理の最先端「重み付き有限状態トランスデューサによる音声認識」,” 情報処理学会誌「情報処理」45巻10号, pp.1020-1026, October 2004.

国際会議予稿

S. Araki, S. Makino, A. Blin, R. Mukai, and H. Sawada, “Underdetermined Blind Separation for Speech in Real Environments with Sparseness and ICA,” ICASSP2004, vol. III, pp. 881-884, May 2004 (invited).
A. Blin, S. Araki and S. Makino, “A Sparseness-Mixing Matrix Estimation (SMME) Solving the Underdetermined BSS for Convolutive Mixtures,” ICASSP2004, vol. IV, pp. 85-88, May 2004.
R. Mukai, H. Sawada, S. Araki, S. Makino, “Near-Field Frequency Domain Blind Source Separation for Convolutive Mixtures,” ICASSP2004, vol. IV, pp. 49-52, May 2004.
H. Sawada, R. Mukai, S. Araki, S. Makino, “Convolutive Blind Source Separation for more than Two Sources in the Frequency Domain,” ICASSP2004, vol. III, pp. 885-888, May 2004 (invited).
S. Makino, S. Araki, R. Mukai, and H. Sawada, “Audio source separation based on independent component analysis,” in Proc. ISCAS2004 (International Symposium on Circuits and Systems), vol. V, pp. 668-671, May 2004 (invited).
R. Mukai, H. Sawada, S. Araki and S. Makino, “Frequency Domain Blind Source Separation using Small and Large Spacing Sensor Pairs,” ISCAS2004, vol. V, pp. 1-4, May 2004.
S. Araki, S. Makino, H. Sawada and R. Mukai, “Underdetermined Blind Speech Separation with Directivity Pattern based Continuous Mask and ICA,” EUSIPCO2004, pp.1991-1994, September 2004.
S. Araki, S. Makino, H. Sawada and R. Mukai, “Underdetermined Blind Separation of Convolutive Mixtures of Speech with Directivity Pattern based Mask and ICA,” ICA2004, pp.898-905, September 2004.
H. Sawada, S. Winter, S. Araki, R. Mukai, S. Makino, “Estimating the Number of Sources for Frequency-Domain Blind Source Separation,” ICA2004 (5th International Conference on Independent Component Analysis and Blind Signal Separation), pp.610-617, September 2004.
S. Winter, H. Sawada, S. Araki, S. Makino, “Overcomplete BSS for convolutive mixtures based on hierarchical clustering,” ICA2004, pp.652-660, September 2004.
R. Mukai, H. Sawada, S. Araki, S. Makino, “Frequency Domain Blind Source Separation for Many Speech Signals,” ICA2004, pp.461-469, September 2004.
S. Winter, H. Sawada, S. Araki, S. Makino, “Hierarchical Clustering Applied to Overcomplete BSS for Convolutive Mixtures,” SAPA2004 (ISCA Tutorial and Research Workshop on Statistical and Perceptual Audio Processing), Session I-3, October 2004.
A. Blin, S. Araki, and S. Makino, “Underdetermined blind source separation for convolutive mixtures exploiting a sparseness-mixing matrix estimation (SMME),” in Proc. ICA2004 (International Congress on Acoustics), vol. IV, pp. 3139-3142, 2004.
H. Sawada, R. Mukai, S. Araki, S. Makino, “Solving the Permutation and the Circularity Problem of Frequency-Domain Blind Source Separation,” ICA2004 (International Congress on Acoustics), vol. I, pp. 89-92, 2004 (invited).
K. Ishizuka and N. Miyazaki, “Speech feature extraction method representing periodicity and aperiodicity in sub bands for robust speech recognition,” Proceedings of the 29th International Conference on Acoustics, Speech, and Signal Processing (ICASSP2004), Vol.1, pp.141-144, 2004.
K. Ishizuka and N. Miyazaki, “Speech feature extraction method representing periodicity and aperiodicity in sub bands for robust speech recognition,” The 2nd NTT Workshop on Communication Scene Analysis (CSA2004) Poster presentation, 2004.
K. Ishizuka, N. Miyazaki, T. Nakatani and Y. Minami, “Improvement in robustness of speech feature extraction method using sub-band based periodicity and aperiodicity decomposition,” Proceedings of the 8th International Conference on Spoken Language Processing (Interspeech2004 - ICSLP), Vol.2, pp.937-940, 2004.
P. Zolfaghari, H. Kato, S. Watanabe and S. Katagiri, “Speech Spectral Modelling using Mixture of Gaussians,” Proc. SWIM, 2004.
P. Zolfaghari, S. Watanabe, A. Nakamura and S. Katagiri, “Bayesian Modelling of the Speech Spectrum Using Mixture of Gaussians,” Proc. ICASSP'04, vol. 1, pp. 553-556, 2004.
R. Mukai, H. Sawada, S. Araki, S. Makino, “A Solution for the Permutation Problem in Frequency Domain BSS using Near- and Far-field Models,” ICA2004 (International Congress on Acoustics), vol. IV, pp. 3135-3138, 2004.
S. Araki, S. Makino, A. Blin, R. Mukai, and H. Sawada, “Underdetermined blind separation of convolutive mixtures of speech by combining time-frequency masks and ICA,” in Proc. ICA2004 (International Congress on Acoustics), vol. I, pp.321-324, 2004.
S. Watanabe and A. Nakamura, “Acoustic model adaptation based on coarse-fine training of transfer vectors and its application to speaker adaptation task,” Proc. ICSLP'04, vol. 4, pp. 2933-2936, 2004.
S. Watanabe and A. Nakamura, “Robustness of acoustic model topology determined by Variational Bayesian Estimation and Clustering for speech recognition for different speech data sets,” Proc. Workshop on statistical modeling approach for speech recognition - Beyond HMM, pp. 55-60, 2004.
S. Watanabe, A. Sako (Ryukoku Univ.) and A. Nakamura, “Automatic Determination of Acoustic Model Topology using Variational Bayesian Estimation and Clustering,” Proc. ICASSP'04, vol. 1, pp. 813-816, 2004.
T. Hori, C. Hori, and Y. Minami, “Fast on-the-fly composition for weighted finite-state transducers in 1.8 million-word vocabulary continuous-speech recognition,” in Proc. ICSLP2004, Vol. 1, pp. 289-292, 2004.

その他会議予稿

渡部晋治, 佐古淳 (龍谷大学), 中村篤, “ベイズ的音声認識VBECを用いた音響モデル構造の自動決定,” 音響学会講演論文集, 1-8-6, pp. 11-12, March 2004.
渡部晋治, 堀貴明, Erik McDermott, 南泰浩, 中村篤, “音声認識システムSOLONの日本語話し言葉コーパスにおける評価,” 音響学会講演論文集, 2-8-7, pp. 73-74, March 2004.
木下慶介, 中谷智広, 三好正人, “調波構造を用いた残響除去法の明瞭性と認識率による音声品質評価,” 日本音響学会春季研究発表会, pp.611-612, March 2004.
堀貴明, 南泰浩, “有限状態トランスデューサ型デコーダの性能改善,” 日本音響学会講演論文集, 3-8-5, March 2004.
渡部晋治, 中村篤, “移動ベクトルのコース/ファイン学習にもとづく音響モデルの教師付き適応,” 音響学会講演論文集, 2-4-11, pp. 107-108, September 2004.
堀貴明, 堀智織, 南泰浩, “WFSTの高速 on-the-fly合成による超大語彙連続音声認識,” 日本音響学会講演論文集, 3-1-25, September 2004.
Mike Schuster, 堀貴明, “Evaluation of beyond triphone order context-dependent models on spontaneous Japanese,” 日本音響学会講演論文集, 3-1-26, September 2004.
M. Schuster and T. Hori, “Efficient generation of high-order context-dependent weighted finite state transducers for speech recognition,” 第６回音声言語シンポジウム, December 2004.
H. Sawada, R. Mukai, S. Araki, S. Makino, “Blind Source Separation for Convolutive Mixtures in the Frequency Domain,” CSA2004.
K. Kinoshita, T. Nakatani and M. Miyoshi, “Improving automatic speech recognition performance and speech intelligibility with harmonicity based dereverberation,” Proc. of Interspeech, 2004.
K. Kinoshita, T. Nakatani and M. Miyoshi, “Speech dereverberation based on harmonic structure using a single microphone,” Poster presentation at 2004 NTT Workshop on Communication Scene Analysis, 2004
R. Mukai, H. Sawada, S. Araki, S. Makino, “A Solution for the Permutation Problem in Frequency Domain BSS using Near- and Far-field Models,” CSA2004.
S. Araki, S. Makino, H. Sawada and R. Mukai, “Blind Separation of More Speech than Sensors using Time-frequency Masks and ICA,” Proceedings of 2004 NTT Workshop on Communication Scene Analysis (CSA2004), (invited)
S. Winter, H. Sawada, S. Araki, S. Makino, “Underdetermined Blind Source Separation for Convolutive Mixtures of Sparse Signals,” CSA2004.
向井, 澤田, 荒木, 牧野, “狭間隔・広間隔の複数マイクロホン対を用いた周波数領域ブラインド音源分離,” 日本音響学会2004年春季研究発表会講演論文集, pp. 627-628, 2004.
石塚健太郎, 宮崎昇, 中谷智広, 南泰浩, “音声特徴抽出法SPADEにおける歪補正法の効果,” 日本音響学会講演論文集, 3-1-4, pp.117-118, 秋季, 2004.
渡部晋治, “[チュートリアル講演] ベイズ法を用いた音声認識,” 電子情報通信学会技術研究報告, SP2004-74, pp. 13-20, 2004.
堀貴明, 渡部晋治, Erik McDermott, 南泰浩, 中村篤, “音声認識システムSOLONの日本語話し言葉コーパスによる評価,” 話し言葉の科学と工学ワークショップ講演予稿集, pp.85-92, 2004.
澤田, 向井, 荒木, 牧野, “独立成分分析を用いた音源数推定法,” 日本音響学会2004年秋季研究発表会講演論文集, pp. 753-754, 2004.

2003

論文

H. Sawada, R. Mukai, S. Araki, S. Makino, “Polar Coordinate based Nonlinear Function for Frequency Domain Blind Source Separation,” IEICE Trans. Fundamentals, vol.E86-A, no.3, pp. 590-596, March 2003.
S. Araki, R. Mukai, S. Makino, T. Nishikawa(NAIST) and H. Saruwatari(NAIST), “The Fundamental Limitation of Frequency Domain Blind Source Separation for Convolutive Mixtures of Speech,” IEEE Trans. Speech Audio Processing, Vol. 11, No. 2, pp. 109-116, 2003.
S. Araki, S. Makino, Y. Hinamoto(NAIST), R. Mukai, T. Nishikawa(NAIST) and H. Saruwatari(NAIST), “Equivalence between Frequency Domain Blind Source Separation and Frequency Domain Adaptive Beamforming for Convolutive Mixtures,” EURASIP Journal on Applied Signal Processing, vol. 2003, no. 11, pp. 1157-1166, 2003.
渡部晋治, 南泰浩, 中村篤, 上田修功, “ベイズ的アプローチに基づく状態共有型HMM構造の選択,” 電子情報通信学会論文誌 D-II Vol. 86, No.6, pp. 776-786, 2003

国際会議予稿

R. Mukai, H. Sawada, S. Araki, S. Makino, “Real-Time Blind Source Separation for Moving Speakers using Blockwise ICA and Residual Crosstalk Subtraction,” ICA2003, pp. 975-980, April 2003.
H. Sawada, R. Mukai, S. Araki, S. Makino, “A Robust and Precise Method for Solving the Permutation Problem of Frequency-Domain Blind Source Separation,” ICA 2003, pp. 505-510, April 2003.
R. Mukai, H. Sawada, S. Araki, S. Makino, “Robust Real-Time Blind Source Separation for Moving Speakers in a Room,” ICASSP2003, pp. 469-472, April 2003.
H. Sawada, R. Mukai, S. Araki, S. Makino, “A Robust Approach to the Permutation Problem of Frequency-Domain Blind Source Separation,” ICASSP 2003, pp. 381-384, April 2003.
T. Hori, D. Willett, and Y. Minami, “Language model adaptation using WFST-based speaking-style translation,” in Proc. ICASSP2003, Vol. 1, pp. 228-231, April 2003.
C. Hori, T. Hori, H. Isozaki, E. Maeda, S. Katagiri, and S. Furui, “Deriving Disambiguous Queries in a Spoken Interactive ODQA System,” in Proc. ICASSP2003, Vol.1, pp. 384-387, April 2003.
T. Hori, D. Willett, and Y. Minami, “Paraphrasing spontaneous speech using weighted finite-state transducers,” in Proc. SSPR2003, pp.219-222, April 2003.
C. Hori, T. Hori, H. Isozaki, E. Maeda, S. Katagiri, and S. Furui, “Study on Spoken Interactive Open Domain Question Answering,” in Proc. SSPR2003, pp.111-113, April 2003.
S. Araki, S. Makino, H. Sawada, A. Blin and R. Mukai, “Underdetermined Blind Separation of Convolutive Mixtures of Speech with Binary Masks and ICA,” NIPS 2003 workshop on ICA: Sparse Representations in Signal Processing, December, 2003. (We did not have the proceedings in the workshop).
A. Blin, S. Araki and S. Makino, “Blind Source Separation when Speech Signals Outnumber Sensors using a Sparseness-Mixing Matrix Combination,” IWAENC2003, pp. 211-214, 2003.
H. Sawada, R. Mukai, S. de la Kethulle, S. Araki and S. Makino, “Spectral Smoothing for Frequency-Domain Blind Source Separation,” IWAENC2003, pp.311-314, 2003.
M. Knaak, S. Araki, S. Makino, “Geometrically Constrained ICA for Convolutive Mixtures of Sound,” ICASSP2003, Vol. II, pp. 725-728, 2003.
M. Knaak, S. Araki, S. Makino, “Geometrically Constrained ICA for Robust Separation of Sound Mixtures,” ICA2003, pp. 951-956, 2003.
R. Aichner, H. Buchner, S. Araki, S. Makino, “On-line Time-domain Blind Source Separation of Nonstationary Convolved Signals,” ICA2003, pp. 987-992, 2003.
R. Mukai, H. Sawada, S. de la Kethulle, S. Araki and S. Makino, “Array Geometry Arrangement for Frequency Domain Blind Source Separation,” IWAENC2003, pp.219-222, 2003.
S. Araki, S. Makino, A. Blin, R. Mukai and H. Sawada, “Blind Separation of More Speech than Sensors with Less Distortion by Combining Sparseness and ICA,” IWAENC2003, pp.271-274, 2003.
S. Araki, S. Makino, R. Aichner, T. Nishikawa(NAIST), and H. Saruwatari(NAIST), “Subband Based Blind Source Separation for Convolutive Mixtures of Speech,” ICASSP2003, Vol. V, pp. 509-512, 2003.
S. Araki, S. Makino, R. Aichner, T. Nishikawa(NAIST), and H. Saruwatari(NAIST), “Subband Based Blind Source Separation with Appropriate Processing for Each Frequency Band,” ICA2003, pp. 499-504, 2003.
S. Watanabe, Y. Minami, A. Nakamura and N. Ueda, “Application of Variational Bayesian Estimation and Clustering to Acoustic Model Adaptation,” Proc. ICASSP'03, vol. 1, pp. 568-571, 2003.
S. Watanabe, Y. Minami, A. Nakamura and N. Ueda, “Bayesian Acoustic Modeling for Spontaneous Speech Recognition,” Proc. SSPR'03, pp. 47-50, 2003.
T. Nishikawa, H. Saruwatari, K. Shikano, S. Araki, S. Makino, “Multistage ICA for Blind Source Separation of Real Acoustic Convolutive Mixture,” ICA2003, pp. 523-528, 2003.
C. Hori, T. Hori, H. Tsukada, H. Isozaki, Y. Sasaki, and E. Maeda, “Spoken Interactive ODQA System: SPIQA,” in Proc. ACL2003, Companion Volume to the Proceedings of the Conference, pp.153-156, 2003.
T. Hori, C. Hori, and Y. Minami, “Speech summarization using weighted finite-state transducers,” in Proc. Eurospeech2003, pp.2817-2820, 2003.
C. Hori, T. Hori, and S. Furui, “Evaluation methods for automatic speech summarization,” in Proc. Eurospeech2003, pp. 2825-2828, 2003.

その他会議予稿

渡部晋治, 南泰浩, 中村篤, 上田修功, “変分ベイズ法の音響モデル適応への応用,” 音響学会講演論文集, 3-3-12, pp. 127-128, March 2003.
堀貴明, 堀智織, 南泰浩, “有限状態トランスデューサによる音声認識文整形要約処理の統合,” 日本音響学会講演論文集, 2-4-4, March 2003.
堀智織, 堀貴明, 磯崎秀樹, 前田英作, 古井貞煕, “音声インタラクティブＯＤＱＡの構築とその評価,” 日本音響学会講演論文集, 2-4-7, March 2003.
渡部晋治, 堀貴明, “ベイズアプローチによるn-gram言語モデリング,” 音響学会講演論文集, 2-6-10, pp. 79-80, September 2003.
P. Zolfaghari, S. Watanabe and S. Katagiri, “Bayesian Modelling of the Spectrum Using Gaussian Mixtures,” 音響学会講演論文集, 2-Q-10, pp. 331-332, September 2003.
堀貴明, 堀智織, 南泰浩, “有限状態トランスデューサによる音声要約法の評価,” 日本音響学会講演論文集, 2-6-14, September 2003.
堀智織, 堀貴明, 塚田元, 磯崎秀樹, “音声インタラクティブＯＤＱＡシステムにおける対話戦略の評価,” 日本音響学会講演論文集, 1-6-17, September 2003.
堀智織, 堀貴明, 古井貞煕, “音声自動要約の評価法,” 日本音響学会講演論文集, 2-6-13, September 2003.
向井良, 澤田宏, 荒木章子, 牧野昭二, “移動音源の低遅延実時間ブラインド分離,” 日本音響学会2003年春季研究発表会講演論文集, pp.779-780, 2003
向井良, 澤田宏, 荒木章子, 牧野昭二, “周波数領域BSSにおける近距離場モデルを用いたパーミュテーションの解法,” 日本音響学会2003年秋季研究発表会講演論文集, pp.589-590, 2003.
荒木章子, Audrey Blin, 牧野昭二, “Blind Separation of More Speech Signals than Sensors using Time-frequency Masking and Mixing Matrix Estimation,” 日本音響学会2003年秋季研究発表会講演論文集, pp.585-586, 2003.
荒木章子, 向井良, 澤田宏, 牧野昭二, “時間周波数マスキングとICAの併用による音源数> マイク数の場合のブラインド音源分離,” 日本音響学会2003年秋季研究発表会講演論文集, pp.587-588, 2003.
荒木章子, 牧野昭二, Robert Aichner, 西川剛樹(NAIST), 猿渡洋(NAIST), “帯域に適した分離手法を用いるサブバンド領域ブラインド音源分離,” 日本音響学会2003年春季研究発表会講演論文集, pp. 781-782, 2003.
渡部晋治, “[招待講演] VBEC:ベイズ的手法にもとづく頑健な音声認識,” 音響学会関西支部第５回若手研究者交流研究発表会, I-2, 2003.
澤田宏, 向井良, 荒木章子, 牧野昭二, “実環境における3音源以上のブラインド分離,” 日本音響学会2003年秋季研究発表会講演論文集, pp.547-548, 2003.
澤田宏, 向井良, 荒木章子, 牧野昭二, “周波数領域ブラインド音源分離におけるpermutation問題の頑健な解法,” 日本音響学会2003年春季研究発表会講演論文集, pp.777-778, 2003