Media Information Laboratory

Message

Dr. Akisato Kimura
Executive Manager

The Media Information Laboratory is organized into five research groups: media recognition, signal processing, computational modeling, biomedical informatics, and computing theory. We pursue basic research on information processing technologies and the fundamental principles of "media", the means by which information is transmitted in communication.

"Media" is a medium for transmitting information among people or between people and computers. It can also be regarded as data obtained by observing the real and virtual worlds. From this viewpoint, media information processing covers not only sounds and images, which can be observed through hearing and sight, but also any other observable data from the real and cyber worlds.

We thus take a broad view of media information processing. We aim to approach the fundamental principles of communication and to develop technologies that enrich our lives in both the real and virtual worlds. To do so, we bring together the experience and knowledge of experts in a wide range of fields, such as real-world measurement, modeling, signal processing, media recognition and understanding, media generation, and the basic mathematical theories and algorithms that support them.

News

  • 03/11/2021

    [Award] Tsubasa Ochiai has received the 16th Itakura Prize Innovative Young Researcher Award from the Acoustical Society of Japan.
    "Joint Optimization of Microphone Array Signal Processing and Speech Recognition"

    https://acoustics.jp/awards/itakura/

  • 02/18/2021

    [Award] Hirokazu Kameoka has received the 10th RIEC Award from the Research Institute of Electrical Communication, Tohoku University.
    "Audio Signal Decomposition and Scene Analysis"

    http://www.riec.tohoku.ac.jp/ja/info/riec-award/r2/

  • 01/28/2021

    [Award] Rintaro Ikeshita has received the 49th Awaya Kiyoshi Science Promotion Award from the Acoustical Society of Japan.
    Rintaro Ikeshita and Tomohiro Nakatani, "Multiplicative update algorithms for independent vector analysis," 2020 Autumn meeting of Acoustical Society of Japan, 1-1-13, 2020.

  • 01/21/2021

    [Award] Onkar Krishna, Go Irie, Xiaomeng Wu, Takahito Kawanishi and Kunio Kashino have received a "Best Research Paper Award Honorable Mention" at the 26th Symposium on Sensing via Image Information.
    Onkar Krishna, Go Irie, Xiaomeng Wu, Takahito Kawanishi and Kunio Kashino (2020). "Adaptive Spotting: 3D Point Cloud Object Search Based on Deep Reinforcement Learning," The 26th Symposium on Sensing via Image Information.

Research groups

Research Index

Publications

2023

Peer-reviewed Conference Papers

  1. Kou Tanaka, Hirokazu Kameoka, Takuhiro Kaneko & Shogo Seki (2023). Distilling sequence-to-sequence voice conversion models for streaming conversion applications. Proc. IEEE Spoken Language Technology Workshop (SLT). Doha, Qatar.

2022

Journal Papers

  1. Ken Mano, Hideki Sakurada & Yasuyuki Tsukada (2022). Quality and quantity pair as trust metric. IEICE Transactions on Information and Systems.
  2. Daisuke Niizumi, Daiki Takeuchi, Yasunori Ohishi, Noboru Harada & Kunio Kashino (2022). Masked Spectrogram Modeling using Masked Autoencoders for Learning General-purpose Audio Representations. IEEE/ACM Transactions on Audio, Speech, and Language Processing (TASLP).
  3. Wangyou Zhang, Xuankai Chang, Christoph Boeddeker, Tomohiro Nakatani, Shinji Watanabe & Yanmin Qian (2022). End-to-end dereverberation, beamforming, and speech recognition in a cocktail party. IEEE/ACM Transactions on Audio, Speech, and Language Processing (TASLP), 30, 3173-3188.
  4. Marc Delcroix, Jorge Bennasar Vázquez, Tsubasa Ochiai, Keisuke Kinoshita, Yasunori Ohishi & Shoko Araki (2022). Soundbeam: target sound extraction conditioned on sound-class labels and enrollment clues for increased performance and continuous learning. IEEE/ACM Transactions on Audio, Speech, and Language Processing (TASLP).
  5. Kenji Ishikawa, Kohei Yatabe, Yasuhiro Oikawa, Yoshifumi Shiraki & Takehiro Moriya (2022). Speckle holographic imaging of sound field using Fresnel lens. Optics Letters.
  6. Daisuke Niizumi, Daiki Takeuchi, Yasunori Ohishi, Noboru Harada & Kunio Kashino (2022). BYOL for audio: Exploring pre-trained general-purpose audio representations. IEEE/ACM Transactions on Audio, Speech, and Language Processing (TASLP).
  7. Daisuke Niizumi, Daiki Takeuchi, Yasunori Ohishi, Noboru Harada & Kunio Kashino (2022). Masked Spectrogram Modeling using Masked Autoencoders for Learning General-purpose Audio Representations. Proceedings of Machine Learning Research (PMLR).
  8. Li Li, Kohei Yatabe, Hirokazu Kameoka & Shoji Makino (2022). FastMVAE2: On improving and accelerating the fast variational autoencoder-based source separation algorithm for determined mixtures. IEEE/ACM Transactions on Audio, Speech, and Language Processing (TASLP).
  9. X. Wu, Y. Sun, A. Kimura, and K. Kashino, "Contrast enhancement based on reflectance-oriented probabilistic equalization," Signal Processing, vol. 194, 2022.

Peer-reviewed Conference Papers

  1. Masato Wakayama (2022). Quantum Interaction and number theory, representation theory - modular forms a bit beyond, infinite symmetric group, Fuchsian ODE. Painlevé Seminar.
  2. Yasunori Ohishi, Marc Delcroix, Tsubasa Ochiai, Shoko Araki, Daiki Takeuchi, Daisuke Niizumi, Akisato Kimura, Noboru Harada & Kunio Kashino (2022). ConceptBeam: Concept driven target speech extraction. Proc. ACM International Conference on Multimedia (ACMMM). Lisbon, Portugal.
  3. Seiya Matsuda, Akisato Kimura & Seiichi Uchida (2022). Font generation with missing impression labels. in Proc. International Conference on Pattern Recognition (ICPR). Montreal, Quebec, Canada.
  4. Kana Goto, Tetsuya Ueda, Li Li, Takeshi Yamada & Shoji Makino (2022). Geometrically constrained independent vector analysis with auxiliary function approach and iterative source steering. in Proc. European Signal Processing Conference (EUSIPCO). Belgrade, Serbia.
  5. Daisuke Niizumi, Daiki Takeuchi, Yasunori Ohishi, Noboru Harada & Kunio Kashino (2022). Composing general audio representation by fusing multi-layer features of pre-trained model. in Proc. European Signal Processing Conference (EUSIPCO). Belgrade, Serbia.
  6. Natsuki Ueno & Hirokazu Kameoka (2022). Multiple sound source localization based on stochastic modeling of spatial gradient spectra. in Proc. European Signal Processing Conference (EUSIPCO). Belgrade, Serbia.
  7. Takuhiro Kaneko, Hirokazu Kameoka, Kou Tanaka & Shogo Seki (2022). MISRNet: Lightweight neural vocoder using multi-input single shared residual blocks. in Proc. Interspeech. Incheon, Korea.
  8. Hirokazu Kameoka, Takuhiro Kaneko, Shogo Seki & Kou Tanaka (2022). CAUSE: Crossmodal action unit sequence estimation from speech. in Proc. Interspeech. Incheon, Korea.
  9. Daiki Takeuchi, Yasunori Ohishi, Daisuke Niizumi, Noboru Harada & Kunio Kashino (2022). Introducing auxiliary text query-modifier to content-based audio retrieval. in Proc. Interspeech. Incheon, Korea.
  10. Takashi Shibata, Masatoshi Okutomi & Masayuki Tanaka (2022). Robustizing object detection networks using augmented feature pooling. in Proc. Asian Conference on Computer Vision (ACCV). Macau SAR, China.
  11. Yu Moriyasu, Takashi Shibata, Masayuki Tanaka & Masatoshi Okutomi (2022). Top-K ensemble for semantic segmentation robust against unexpected degradation. Proc. IEEE International Conference on Consumer Electronics (ICCE). Bordeaux, France.
  12. Yasuhiro Fujiwara, Masahiro Nakano, Atsutoshi Kumagai, Yasutoshi Ida, Akisato Kimura & Naonori Ueda (2022). Fast binary network hashing via graph clustering. Proc. IEEE BigData. Osaka, Japan.
  13. Denny Hermawanto, Kenji Ishikawa, Kohei Yatabe & Yasuhiro Oikawa (2022). Visualization of microphone's acoustic center using phase-shifting interferometry. Proc. International Congress on Acoustics (ICA). Gyeongju, Korea.
  14. M. Nakano, R. Nishikimi, Y. Fujiwara, A. Kimura, T. Yamada, and N. Ueda, "Nonparametric relational models with superrectangulation," in Proc. International Conference on Artificial Intelligence and Statistics (AISTATS), 2022.
  15. G. Irie, T. Shibata, and A. Kimura, "Co-attention-guided bilinear model for echo-based depth estimation," in Proc. International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2022.
  16. T. Kaneko, K. Tanaka, H. Kameoka, and S. Seki, "Fastening and lightening convolutional mel-spectrogram vocoder using inverse short-time Fourier transform," in Proc. International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2022.
  17. S. Seki, H. Kameoka, and L. Li, "Exploring and improving multichannel variational autoencoder for underdetermined source separation," in Proc. International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2022.
  18. L. Li, H. Kameoka, and S. Seki, "HBP: An efficient block permutation solver using Hungarian algorithm and spectrogram inpainting for multichannel audio source separation," in Proc. International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2022.
  19. H. Kameoka, S. Seki, L. Li, and C. Watanabe, "AttentionPIT: Soft permutation invariant training for audio source separation with attention mechanism," in Proc. International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2022.
  20. T. Kaneko, "AR-NeRF: Unsupervised learning of depth and defocus effects from natural images with aperture rendering neural radiance fields," in Proc. Conference on Computer Vision and Pattern Recognition (CVPR), 2022.
  21. S. Yoneda, G. Irie, T. Shibata, M. Nishiyama, and I. Yoshio, "Deep segmentation network without mask image supervision for 2D image registration," in Proc. International Workshop on Frontiers of Computer Vision (IW-FCV), 2022.
  22. M. Ueda, A. Kimura, and S. Uchida, "Font shape-to-impression translation," in Proc. International Workshop on Document Analysis Systems (DAS), 2022.
  23. C. Kabore, M. Tsuchida, I. Suzuki, S. Sugaya, A. Kimura, and N. Harada, "Prototyping of low-cost color enhancement lighting using multicolor LEDs," in Proc. International Symposium on Electronic Imaging (EI), 2022.

Members

Executive Manager

Fellow

Senior Distinguished Researchers

Recognition Research Group

Signal Processing Research Group

Computing Theory Research Group

Computational Modeling Research Group

Biomedical Informatics Research Group

Access

Last Update: 10/5/2023