Koichi Saito

Cited by

	All	Since 2021
Citations	274	274
h-index	9	9
i10-index	9	9

Citations per year: 2022 (8), 2023 (34), 2024 (68), 2025 (161), 2026 (3)

Public access

2 articles available
0 articles not available

Based on funding mandates

Co-authors

Yuki Mitsufuji: Distinguished Engineer, Sony; Visiting Research Professor, New York University (Verified email at sony.com)
Chieh-Hsin Lai (Jesse): Sony AI; Visiting Assistant Professor of Applied Math, National Chiao Tung University (Verified email at sony.com)
Yuhta Takida: Sony AI (Verified email at sony.com)
Naoki Murata: Sony Research (Verified email at sony.com)
Toshimitsu Uesaka: Sony Research Inc. (Verified email at sony.com)
Takashi Shibuya: Sony (Verified email at sony.com)
Zhi Zhong: Sony (Verified email at sony.com)
Chris Donahue: Assistant Professor, CMU CSD; Research Scientist, Google DeepMind (part time) (Verified email at cmu.edu)
Shinji Watanabe: Carnegie Mellon University (Verified email at cmu.edu)
Jiatong Shi (史嘉彤): Carnegie Mellon University (Verified email at andrew.cmu.edu)
Kohei Yatabe: Tokyo University of Agriculture and Technology (Verified email at go.tuat.ac.jp)
Hiroshi Saruwatari: Professor, The University of Tokyo (Verified email at ipc.i.u-tokyo.ac.jp)
WeiHsiang Liao: Sony Research Inc. (Verified email at sony.com)
Dongjun Kim: Stanford University (Verified email at stanford.edu)
Marco A. Martinez-Ramirez: Music technology researcher, Sony AI (Verified email at sony.com)
Zachary Novack: CS PhD Student, UC - San Diego (Verified email at ucsd.edu)
Stefan Uhlich: Sony Europe Ltd, ZNL Deutschland (Verified email at lss.uni-stuttgart.de)
Yuma Koizumi: Google DeepMind (Verified email at google.com)
Yin-Jyun (Jun) Luo: Spotify (Verified email at spotify.com)

Koichi Saito

Sony AI

Verified email at sony.com

Deep generative modeling, Sound Generation, Audio signal processing, Music Technology, World models


Title	Authors	Venue	Cited by	Year
Gibbsddrm: A partially collapsed gibbs sampler for solving blind inverse problems with denoising diffusion restoration	N Murata, K Saito, CH Lai, Y Takida, T Uesaka, Y Mitsufuji, S Ermon	International conference on machine learning (ICML 2023), 25501-25522, 2023	82	2023
Versa: A versatile evaluation toolkit for speech, audio, and music	J Shi, H Shim, J Tian, S Arora, H Wu, D Petermann, JQ Yip, Y Zhang, ...	Proceedings of the 2025 Annual Conference of the North American Chapter of …, 2024	40	2024
Unsupervised vocal dereverberation with diffusion-based generative models	K Saito, N Murata, T Uesaka, CH Lai, Y Takida, T Fukui, Y Mitsufuji	ICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and …, 2023	34	2023
SoundCTM: Unifying Score-based and Consistency Models for Full-band Text-to-Sound Generation	K Saito, D Kim, T Shibuya, CH Lai, Z Zhong, Y Takida, Y Mitsufuji	International Conference on Learning Representations (ICLR), 2025, https …, 2025	21*	2025
Vrdmg: Vocal restoration via diffusion posterior sampling with multiple guidance	C Hernandez-Olivan, K Saito, N Murata, CH Lai, MA Martínez-Ramirez, ...	ICASSP 2024-2024 IEEE International Conference on Acoustics, Speech and …, 2024	18	2024
Specmaskgit: Masked generative modeling of audio spectrograms for efficient audio synthesis and beyond	M Comunità, Z Zhong, A Takahashi, S Yang, M Zhao, K Saito, Y Ikemiya, ...	25th International Society for Music Information Retrieval (ISMIR) Conference, 2024	17	2024
Training speech enhancement systems with noisy speech datasets	K Saito, S Uhlich, G Fabbro, Y Mitsufuji	arXiv preprint arXiv:2105.12315, 2021	16	2021
Sampling-frequency-independent convolutional layer and its application to audio source separation	K Saito, T Nakamura, K Yatabe, H Saruwatari	IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 2928-2943, 2022	15	2022
Sampling-frequency-independent audio source separation using convolution layer based on impulse invariant method	K Saito, T Nakamura, K Yatabe, Y Koizumi, H Saruwatari	2021 29th European Signal Processing Conference (EUSIPCO), 321-325, 2021	10	2021
Aligning Text-to-Music Evaluation with Human Preferences	Y Huang, Z Novack, K Saito, J Shi, S Watanabe, Y Mitsufuji, J Thickstun, ...	26th International Society for Music Information Retrieval (ISMIR) Conference, 2025	7	2025
Dyadic Mamba: Long-term Dyadic Human Motion Synthesis	J Tanke, T Shibuya, K Uchida, K Saito, Y Mitsufuji	Proceedings of the Computer Vision and Pattern Recognition Conference, 2868-2877, 2025	6	2025
TalkCuts: A Large-Scale Dataset for Multi-Shot Human Speech Video Generation	J Chen, Z Wang, A Zeng, Y Fu, X Yu, S Cen, J Tanke, Y Chen, K Saito, ...	Neural Information Processing Systems (NeurIPS 2025) D&B Track, 2025	3	2025
Dismix: Disentangling mixtures of musical instruments for source-level pitch and timbre manipulation	YJ Luo, KW Cheuk, W Choi, T Uesaka, K Toyama, K Saito, CH Lai, ...	arXiv preprint arXiv:2408.10807, 2024	2	2024
FoleyBench: A Benchmark For Video-to-Audio Models	S Dixit, K Saito, Z Zhong, Y Mitsufuji, C Donahue	arXiv preprint arXiv:2511.13219, 2025	1	2025
Schrödinger Bridge Consistency Trajectory Models for Speech Enhancement	S Nishigori, K Saito, N Murata, M Hirano, S Takahashi, Y Mitsufuji	2025 IEEE Workshop on Applications of Signal Processing to Audio and …, 2025	1	2025
Disentangling Multi-instrument Music Audio for Source-level Pitch and Timbre Manipulation	YJ Luo, KW Cheuk, W Choi, WH Liao, K Toyama, T Uesaka, K Saito, ...	Audio Imagination: NeurIPS 2024 Workshop AI-Driven Speech, Music, and Sound …, 2024	1	2024
Schrodinger Audio-Visual Editor: Object-Level Audiovisual Removal	W Xu, KJ Cheng, K Saito, MJ Mirza, T Li, Y Liu, AH Liu, L Wang, M Ishii, ...	arXiv preprint arXiv:2512.12875, 2025		2025
SoundReactor: Frame-level Online Video-to-Audio Generation	K Saito, J Tanke, C Simon, M Ishii, K Shimada, Z Novack, Z Zhong, ...	arXiv preprint arXiv:2510.02110, 2025		2025
Music Arena: Live Evaluation for Text-to-Music	Y Kim, W Chi, AN Angelopoulos, WL Chiang, K Saito, S Watanabe, ...	Neural Information Processing Systems (NeurIPS 2025) Creative AI Track: Humanity, 2025		2025
From White Noise to Symphony: Diffusion Models for Music and Sound -- ISMIR24 Diffusion Model Tutorial	CH Lai, B Nguyen, K Saito, S Ermon, Y Mitsufuji	https://github.com/ChiehHsinJesseLai/ISMIR24DiffusionModelTutorial, 2024		2024
