Computational Modeling Research Group

The Computational Modeling Research Group researches fundamental technologies for modeling and processing multimodal events using physical, mathematical, statistical, and psychological models, with the goal of realizing valuable functions such as real-world sensing and enhanced communication.

Group Leader: Yasunori Ohishi

Publications

2024

Peer-reviewed Conference Papers

  1. Shiqi Zhang, Zheng Qiu, Daiki Takeuchi, Noboru Harada & Shoji Makino (2024). Unrestricted Global-Phase-Bias Aware Single-channel Speech Enhancement with Conformer-based Metric GAN. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). Seoul, Korea.
  2. Yuto Kondo, Hirokazu Kameoka, Kou Tanaka & Takuhiro Kaneko (2024). Selecting N-lowest Scores for Training MOS Prediction Models. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). Seoul, Korea.
  3. Takuhiro Kaneko, Hirokazu Kameoka & Kou Tanaka (2024). Training Generative Adversarial Network-Based Vocoder with Limited Data Using Augmentation-Conditional Discriminator. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). Seoul, Korea.
  4. Bo He, Shiqi Zhang, Xianrui Wang, Zheng Qiu, Daiki Takeuchi, Daisuke Niizumi, Noboru Harada & Shoji Makino (2024). Light Gated Multi Mini-patch Extractor for Audio Classification. ICASSP2024 Satellite Workshop on Hands-free Speech Communication and Microphone Arrays (HSCMA 2024).

2023

Journal Papers

  1. Phuc Duc Nguyen, Yoshifumi Shiraki, Kenji Ishikawa, Jun Muramatsu, Noboru Harada & Takehiro Moriya (2023). Distribution Matching for Dimming Control in Visible-Light Region-of-Interest Signaling. IEEE Photonics Journal, 15 (1), 1-14.
  2. Denny Hermawanto, Kenji Ishikawa, Kohei Yatabe & Yasuhiro Oikawa (2023). Determination of Microphone Acoustic Center from Sound Field Projection Measured by Optical Interferometry. The Journal of the Acoustical Society of America.
  3. Shogo Seki, Hirokazu Kameoka, Takuhiro Kaneko & Kou Tanaka (2023). Non-parallel Whisper-to-Normal Speaking Style Conversion Using Auxiliary Classifier Variational Autoencoder. IEEE Access, 11, 44590-44599.
  4. Samuel A. Verburg, Kenji Ishikawa, Efren Fernandez-Grande & Yasuhiro Oikawa (2023). A Century of Acousto-Optics: From Early Discoveries to Modern Sensing of Sound with Light. Acoustics Today, 19 (3), 54-62.
  5. Ryosuke Sugiura, Yutaka Kamamoto & Takehiro Moriya (2023). General form of almost instantaneous fixed-to-variable-length codes and optimal code tree construction. IEEE Transactions on Information Theory, 69 (12).
  6. Kenji Ishikawa, Yoshifumi Shiraki, Takehiro Moriya, Atsushi Ishizawa, Kenichi Hitachi & Katsuya Oguri (2023). Comprehensive Noise Analysis for Acousto-optic Measurement of Airborne Sound. IEEE Transactions on Instrumentation and Measurement, 73 (7000309).

Peer-reviewed Conference Papers

  1. Kou Tanaka, Hirokazu Kameoka, Takuhiro Kaneko & Shogo Seki (2023). Distilling sequence-to-sequence voice conversion models for streaming conversion applications. Proc. IEEE Spoken Language Technology Workshop (SLT). Doha, Qatar.
  2. Shogo Seki, Hirokazu Kameoka, Kou Tanaka & Takuhiro Kaneko (2023). JSV-VC: Jointly Trained Speaker Verification and Voice Conversion Models. Proc. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP). Rhodes Island, Greece.
  3. Daisuke Niizumi, Daiki Takeuchi, Yasunori Ohishi, Noboru Harada & Kunio Kashino (2023). Masked Modeling Duo: Learning Representations by Encouraging Both Networks to Model the Input. Proc. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP). Rhodes Island, Greece.
  4. Takuhiro Kaneko, Hirokazu Kameoka, Kou Tanaka & Shogo Seki (2023). Wave-U-Net Discriminator: Fast and Lightweight Discriminator for Generative Adversarial Network-Based Speech Synthesis. Proc. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP). Rhodes Island, Greece.
  5. Takuhiro Kaneko, Hirokazu Kameoka, Kou Tanaka & Shogo Seki (2023). iSTFTNet2: Faster and More Lightweight iSTFT-Based Neural Vocoder Using 1D-2D CNN. Proc. Interspeech. Dublin, Ireland.
  6. Daisuke Niizumi, Daiki Takeuchi, Yasunori Ohishi, Noboru Harada & Kunio Kashino (2023). Masked Modeling Duo for Speech: Specializing General-Purpose Audio Representation to Speech using Denoising Distillation. Proc. Interspeech. Dublin, Ireland.
  7. Kou Tanaka, Takuhiro Kaneko, Hirokazu Kameoka & Shogo Seki (2023). CFVC: Conditional Filtering for Controllable Voice Conversion. Proc. Interspeech. Dublin, Ireland.
  8. Noboru Harada, Daisuke Niizumi, Yasunori Ohishi, Daiki Takeuchi & Masahiro Yasuda (2023). First-Shot Anomaly Sound Detection for Machine Condition Monitoring: A Domain Generalization Baseline. Proc. European Signal Processing Conference (EUSIPCO). Helsinki, Finland.
  9. Shogo Seki, Kanami Imamura, Hirokazu Kameoka, Takuhiro Kaneko, Kou Tanaka & Noboru Harada (2023). W2N-AVSC: Audiovisual Extension for Whisper-to-Normal Speech Conversion. Proc. European Signal Processing Conference (EUSIPCO). Helsinki, Finland.
  10. Kou Tanaka, Hirokazu Kameoka & Takuhiro Kaneko (2023). PRVAE-VC: Non-Parallel Many-to-Many Voice Conversion with Perturbation-Resistant Variational Autoencoder. Proc. ISCA Speech Synthesis Workshop (SSW). Grenoble, France.
  11. Boxin Liu, Shiqi Zhang, Daiki Takeuchi, Daisuke Niizumi, Noboru Harada & Shoji Makino (2023). Masked modeling duo vision transformer with multi-layer feature fusion on respiratory sound classification. Proc. Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop. Tampere, Finland.
  12. Chihiro Watanabe & Hirokazu Kameoka (2023). DisC-VC: Disentangled and F0-Controllable Neural Voice Conversion. Proc. Asia-Pacific Signal and Information Processing Association (APSIPA) Annual Summit and Conference (ASC). Taipei, Taiwan.
  13. Kota Dohi, Keisuke Imoto, Noboru Harada, Daisuke Niizumi, Yuma Koizumi, Tomoya Nishida, Harsh Purohit, Ryo Tanabe, Takashi Endo & Yohei Kawaguchi (2023). Description and Discussion on DCASE 2023 Challenge Task 2: First-Shot Unsupervised Anomalous Sound Detection for Machine Condition Monitoring. Proc. Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop. Tampere, Finland.
  14. Daiki Takeuchi, Yasunori Ohishi, Daisuke Niizumi, Noboru Harada & Kunio Kashino (2023). Similarity-discrepancy disentanglement for audio difference captioning. Proc. Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop. Tampere, Finland.
  15. Noboru Harada, Daisuke Niizumi, Daiki Takeuchi, Yasunori Ohishi & Masahiro Yasuda (2023). ToyADMOS2+: New Toyadmos Data and Benchmark Results of the First-Shot Anomalous Sound Event Detection Baseline. Proc. Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop. Tampere, Finland.
  16. Keisuke Takazawa, Hirokazu Kameoka & Masahiro Yukawa (2023). Multiple Sound Source Tracking Based on Generative Modeling and Recursive Bayesian Filtering of Spatial Gradient Spectra. Proc. Asia-Pacific Signal and Information Processing Association (APSIPA) Annual Summit and Conference (ASC). Taipei, Taiwan.
  17. Noboru Harada, Daisuke Niizumi, Yasunori Ohishi, Daiki Takeuchi & Masahiro Yasuda (2023). First-shot anomaly sound detection for machine condition monitoring: A Domain Generalization baseline. Proc. Asia-Pacific Signal and Information Processing Association (APSIPA) Annual Summit and Conference (ASC). Taipei, Taiwan.
  18. Haruka Nozawa, Mayuko Imanishi, Yasuhiro Oikawa & Kenji Ishikawa (2023). Physical-model-based reconstruction of three-dimensional sound field from multi-directional measurement by parallel phase-shift interferometry. Proc. The Australian Acoustical Society (Acoustics2023). Sydney, Australia.

2022

Journal Papers

  1. Kenji Ishikawa, Kohei Yatabe, Yasuhiro Oikawa, Yoshifumi Shiraki & Takehiro Moriya (2022). Speckle holographic imaging of sound field using Fresnel lens. Optics Letters.
  2. Daisuke Niizumi, Daiki Takeuchi, Yasunori Ohishi, Noboru Harada & Kunio Kashino (2022). BYOL for audio: Exploring pre-trained general-purpose audio representations. IEEE/ACM Transactions on Audio, Speech and Language Processing (TASLP).
  3. Marc Delcroix, Jorge Bennasar Vázquez, Tsubasa Ochiai, Keisuke Kinoshita, Yasunori Ohishi & Shoko Araki (2022). SoundBeam: target sound extraction conditioned on sound-class labels and enrollment clues for increased performance and continuous learning. IEEE/ACM Transactions on Audio, Speech and Language Processing (TASLP).
  4. Daisuke Niizumi, Daiki Takeuchi, Yasunori Ohishi, Noboru Harada & Kunio Kashino (2022). Masked Spectrogram Modeling using Masked Autoencoders for Learning General-purpose Audio Representations. Proceedings of Machine Learning Research (PMLR).
  5. Li Li, Kohei Yatabe, Hirokazu Kameoka & Shoji Makino (2022). FastMVAE2: On improving and accelerating the fast variational autoencoder-based source separation algorithm for determined mixtures. IEEE/ACM Transactions on Audio, Speech and Language Processing (TASLP).

Peer-reviewed Conference Papers

  1. Yasunori Ohishi, Marc Delcroix, Tsubasa Ochiai, Shoko Araki, Daiki Takeuchi, Daisuke Niizumi, Akisato Kimura, Noboru Harada & Kunio Kashino (2022). ConceptBeam: Concept driven target speech extraction. Proc. ACM International Conference on Multimedia (ACMMM). Lisbon, Portugal.
  2. Denny Hermawanto, Kenji Ishikawa, Kohei Yatabe & Yasuhiro Oikawa (2022). Visualization of microphone's acoustic center using phase-shifting interferometry. Proc. International Congress on Acoustics (ICA). Gyeongju, Korea.

Members

Related Research Groups