| GigaSpeech: An evolving, multi-domain ASR corpus with 10,000 hours of transcribed audio G Chen, S Chai, G Wang, J Du, WQ Zhang, C Weng, D Su, D Povey, ... arXiv preprint arXiv:2106.06909, 2021 | 376 | 2021 |
| KeSpeech: An open source speech dataset of Mandarin and its eight subdialects Z Tang, D Wang, Y Xu, J Sun, X Lei, S Zhao, C Wen, X Tan, C Xie, S Zhou, ... Thirty-fifth Conference on Neural Information Processing Systems Datasets …, 2021 | 70 | 2021 |
| DiDiSpeech: A large scale Mandarin speech corpus T Guo, C Wen, D Jiang, N Luo, R Zhang, S Zhao, W Li, C Gong, W Zou, ... ICASSP 2021-2021 IEEE International Conference on Acoustics, Speech and …, 2021 | 65 | 2021 |
| Towards end-to-end code-switching speech recognition N Luo, D Jiang, S Zhao, C Gong, W Zou, X Li arXiv preprint arXiv:1810.13091, 2018 | 65 | 2018 |
| Comparable study of modeling units for end-to-end Mandarin speech recognition W Zou, D Jiang, S Zhao, G Yang, X Li 2018 11th International Symposium on Chinese Spoken Language Processing …, 2018 | 40 | 2018 |
| ChatHome: Development and evaluation of a domain-specific language model for home renovation C Wen, X Sun, S Zhao, X Fang, L Chen, W Zou arXiv preprint arXiv:2307.15290, 2023 | 37 | 2023 |
| SARI: Structured audio reasoning via curriculum-guided reinforcement learning C Wen, T Guo, S Zhao, W Zou, X Li arXiv preprint arXiv:2504.15900, 2025 | 12 | 2025 |
| Semantic data augmentation for end-to-end Mandarin speech recognition J Sun, Z Tang, H Yin, W Wang, X Zhao, S Zhao, X Lei, W Zou, X Li arXiv preprint arXiv:2104.12521, 2021 | 12 | 2021 |
| Audio-visual wake word spotting system for MISP Challenge 2021 Y Xu, J Sun, Y Han, S Zhao, C Mei, T Guo, S Zhou, C Xie, W Zou, X Li ICASSP 2022-2022 IEEE International Conference on Acoustics, Speech and …, 2022 | 11 | 2022 |
| GigaSpeech: An Evolving, Multi-domain ASR Corpus with 10,000 Hours of Transcribed Audio G Chen, S Chai, G Wang, J Du, WQ Zhang, C Weng, D Su, D Povey, ... 2021 | 11 | 2021 |
| SARI: Structured audio reasoning via curriculum-guided reinforcement learning C Wen, T Guo, S Zhao, W Zou, X Li arXiv preprint arXiv:2504.15900, 2025 | 9 | 2025 |
| Advancing speech language models by scaling supervised fine-tuning with over 60,000 hours of synthetic speech dialogue data S Zhao, T Guo, B Xiang, T Wan, Q Niu, W Zou, X Li arXiv preprint arXiv:2412.01078, 2024 | 5 | 2024 |
| An Analysis of Decoding for Attention-Based End-to-End Mandarin Speech Recognition D Jiang, W Zou, S Zhao, G Yang, X Li 2018 11th International Symposium on Chinese Spoken Language Processing …, 2018 | 2 | 2018 |
| Understanding the Modality Gap: An Empirical Study on the Speech-Text Alignment Mechanism of Large Speech Language Models B Xiang, S Zhao, T Guo, W Zou Proceedings of the 2025 Conference on Empirical Methods in Natural Language …, 2025 | 1 | 2025 |
| Technical report: Competition solution for BetterMixture S Zhao, X Fang arXiv preprint arXiv:2403.13233, 2024 | 1 | 2024 |