Koichi Saito
Title
Cited by
Year
GibbsDDRM: A partially collapsed Gibbs sampler for solving blind inverse problems with denoising diffusion restoration
N Murata, K Saito, CH Lai, Y Takida, T Uesaka, Y Mitsufuji, S Ermon
International Conference on Machine Learning (ICML 2023), 25501-25522, 2023
Cited by 82, 2023
VERSA: A versatile evaluation toolkit for speech, audio, and music
J Shi, H Shim, J Tian, S Arora, H Wu, D Petermann, JQ Yip, Y Zhang, ...
Proceedings of the 2025 Annual Conference of the North American Chapter of …, 2024
Cited by 40, 2024
Unsupervised vocal dereverberation with diffusion-based generative models
K Saito, N Murata, T Uesaka, CH Lai, Y Takida, T Fukui, Y Mitsufuji
ICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and …, 2023
Cited by 34, 2023
SoundCTM: Unifying Score-based and Consistency Models for Full-band Text-to-Sound Generation
K Saito, D Kim, T Shibuya, CH Lai, Z Zhong, Y Takida, Y Mitsufuji
International Conference on Learning Representations (ICLR), 2025, https …, 2025
Cited by 21*, 2025
VRDMG: Vocal restoration via diffusion posterior sampling with multiple guidance
C Hernandez-Olivan, K Saito, N Murata, CH Lai, MA Martínez-Ramirez, ...
ICASSP 2024-2024 IEEE International Conference on Acoustics, Speech and …, 2024
Cited by 18, 2024
SpecMaskGIT: Masked generative modeling of audio spectrograms for efficient audio synthesis and beyond
M Comunità, Z Zhong, A Takahashi, S Yang, M Zhao, K Saito, Y Ikemiya, ...
25th International Society for Music Information Retrieval (ISMIR) Conference, 2024
Cited by 17, 2024
Training speech enhancement systems with noisy speech datasets
K Saito, S Uhlich, G Fabbro, Y Mitsufuji
arXiv preprint arXiv:2105.12315, 2021
Cited by 16, 2021
Sampling-frequency-independent convolutional layer and its application to audio source separation
K Saito, T Nakamura, K Yatabe, H Saruwatari
IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 2928-2943, 2022
Cited by 15, 2022
Sampling-frequency-independent audio source separation using convolution layer based on impulse invariant method
K Saito, T Nakamura, K Yatabe, Y Koizumi, H Saruwatari
2021 29th European Signal Processing Conference (EUSIPCO), 321-325, 2021
Cited by 10, 2021
Aligning Text-to-Music Evaluation with Human Preferences
Y Huang, Z Novack, K Saito, J Shi, S Watanabe, Y Mitsufuji, J Thickstun, ...
26th International Society for Music Information Retrieval (ISMIR) Conference, 2025
Cited by 7, 2025
Dyadic Mamba: Long-term Dyadic Human Motion Synthesis
J Tanke, T Shibuya, K Uchida, K Saito, Y Mitsufuji
Proceedings of the Computer Vision and Pattern Recognition Conference, 2868-2877, 2025
Cited by 6, 2025
TalkCuts: A Large-Scale Dataset for Multi-Shot Human Speech Video Generation
J Chen, Z Wang, A Zeng, Y Fu, X Yu, S Cen, J Tanke, Y Chen, K Saito, ...
Neural Information Processing Systems (NeurIPS 2025) D&B Track, 2025
Cited by 3, 2025
DisMix: Disentangling mixtures of musical instruments for source-level pitch and timbre manipulation
YJ Luo, KW Cheuk, W Choi, T Uesaka, K Toyama, K Saito, CH Lai, ...
arXiv preprint arXiv:2408.10807, 2024
Cited by 2, 2024
FoleyBench: A Benchmark For Video-to-Audio Models
S Dixit, K Saito, Z Zhong, Y Mitsufuji, C Donahue
arXiv preprint arXiv:2511.13219, 2025
Cited by 1, 2025
Schrödinger Bridge Consistency Trajectory Models for Speech Enhancement
S Nishigori, K Saito, N Murata, M Hirano, S Takahashi, Y Mitsufuji
2025 IEEE Workshop on Applications of Signal Processing to Audio and …, 2025
Cited by 1, 2025
Disentangling Multi-instrument Music Audio for Source-level Pitch and Timbre Manipulation
YJ Luo, KW Cheuk, W Choi, WH Liao, K Toyama, T Uesaka, K Saito, ...
Audio Imagination: NeurIPS 2024 Workshop AI-Driven Speech, Music, and Sound …, 2024
Cited by 1, 2024
Schrodinger Audio-Visual Editor: Object-Level Audiovisual Removal
W Xu, KJ Cheng, K Saito, MJ Mirza, T Li, Y Liu, AH Liu, L Wang, M Ishii, ...
arXiv preprint arXiv:2512.12875, 2025
2025
SoundReactor: Frame-level Online Video-to-Audio Generation
K Saito, J Tanke, C Simon, M Ishii, K Shimada, Z Novack, Z Zhong, ...
arXiv preprint arXiv:2510.02110, 2025
2025
Music Arena: Live Evaluation for Text-to-Music
Y Kim, W Chi, AN Angelopoulos, WL Chiang, K Saito, S Watanabe, ...
Neural Information Processing Systems (NeurIPS 2025) Creative AI Track: Humanity, 2025
2025
From White Noise to Symphony: Diffusion Models for Music and Sound -- ISMIR24 Diffusion Model Tutorial
CH Lai, B Nguyen, K Saito, S Ermon, Y Mitsufuji
https://github.com/ChiehHsinJesseLai/ISMIR24DiffusionModelTutorial, 2024
2024