Richard He Bai

Cited by

	All	Since 2021
Citations	488	481
h-index	11	11
i10-index	13	12

Citations per year
Year	Citations
2020	6
2021	24
2022	36
2023	67
2024	135
2025	205
2026	13

Public access

4 articles

0 articles

available

not available

Based on funding mandates

Co-authors

Navdeep Jaitly	Apple	Verified email at apple.com
Peng Shi	University of Waterloo	Verified email at uwaterloo.ca
Jimmy Lin	University of Waterloo	Verified email at uwaterloo.ca
Yizhe Zhang	Apple	Verified email at apple.com
Tatiana Likhomanenko	Apple	Verified email at apple.com
Zijin Gu	Apple	Verified email at apple.com
Ruixiang Zhang	Apple	Verified email at apple.com
Jiatao Gu	UPenn CIS / Apple MLR	Verified email at apple.com
Ming Li	University Professor, University of Waterloo	Verified email at uwaterloo.ca
David Grangier	Apple Machine Learning Research	Verified email at apple.com
Zakaria Aldeneh	Apple	Verified email at apple.com
Joshua M Susskind	Apple	Verified email at apple.com
Liang Huang	Professor of Computer Science and (by courtesy) Biochemistry/Biophysics, Oregon State University	Verified email at oregonstate.edu
Chengqing Zong (宗成庆)	Institute of Automation, Chinese Academy of Sciences	Verified email at nlpr.ia.ac.cn
Zhuofeng Wu	Apple	Verified email at apple.com
Shuangfei Zhai	Apple, Machine Learning Research	Verified email at apple.com
Mei-Yuh Hwang	University of Washington	Verified email at uw.edu
Ronan Collobert	Machine Learning Research, Apple	Verified email at apple.com
Ihab F. Ilyas	University of Waterloo	Verified email at uwaterloo.ca

Richard He Bai

Other names: Richard Bai, He Bai

Research Scientist, Apple Machine Learning Research

Verified email at apple.com

Natural language processing, Machine Learning


Title	Authors	Venue	Cited by	Year
Rephrasing the web: A recipe for compute and data-efficient language modeling	P Maini, S Seto, H Bai, D Grangier, Y Zhang, N Jaitly	ACL 2024, 2024	97	2024
A³T: Alignment-Aware Acoustic and Text Pretraining for Speech Synthesis and Editing	H Bai, R Zheng, J Chen, M Ma, X Li, L Huang	International Conference on Machine Learning, 1399-1411, 2022	61	2022
Xricl: Cross-lingual retrieval-augmented in-context learning for cross-lingual text-to-sql semantic parsing	P Shi, R Zhang, H Bai, J Lin	EMNLP 2022, 2022	54	2022
Cross-lingual training of dense retrievers for document retrieval	P Shi, R Zhang, H Bai, J Lin	Proceedings of the 1st Workshop on Multilingual Representation Learning, 251-253, 2021	36*	2021
Divide-or-Conquer? Which Part Should You Distill Your LLM?	Z Wu, H Bai, A Zhang, J Gu, VG Vydiswaran, N Jaitly, Y Zhang	EMNLP 2024, 2024	35	2024
Segatron: Segment-Aware Transformer for Language Modeling and Understanding	H Bai, P Shi, J Lin, Y Xie, L Tan, K Xiong, W Gao, M Li	AAAI 2021, 2020	31*	2020
How Far Are We from Intelligent Visual Deductive Reasoning?	Y Zhang, H Bai, R Zhang, J Gu, S Zhai, J Susskind, N Jaitly	COLM 2024, 2024	29	2024
Cross-lingual training of neural models for document ranking	P Shi, H Bai, J Lin	Findings of the Association for Computational Linguistics: EMNLP 2020, 2768-2773, 2020	28	2020
Better language model with hypernym class prediction	H Bai, T Wang, A Sordoni, P Shi	ACL 2022, 2022	19	2022
Source-Critical Reinforcement Learning for Transferring Spoken Language Understanding to a New Language	H Bai, Y Zhou, J Zhang, L Zhao, MY Hwang, C Zong	COLING 2018, 2018	16	2018
dmel: Speech tokenization made simple	RH Bai, T Likhomanenko, R Zhang, Z Gu, Z Aldeneh, N Jaitly	arXiv preprint arXiv:2407.15835, 2024	12	2024
KGLens: A Parameterized Knowledge Graph Solution to Assess What an LLM Does and Doesn’t Know	S Zheng, H Bai, Y Zhang, Y Su, X Niu, N Jaitly	arXiv preprint arXiv:2312.11539, 2023	10*	2023
Memory Consolidation for Contextual Spoken Language Understanding with Dialogue Logistic Inference	H Bai, Y Zhou, J Zhang, C Zong	ACL 2019, 2019	10	2019
Denoising lm: Pushing the limits of error correction models for speech recognition	Z Gu, T Likhomanenko, H Bai, E McDermott, R Collobert, N Jaitly	arXiv preprint arXiv:2405.15216, 2024	9	2024
Semantics of the unwritten: The effect of end of paragraph and sequence tokens on text generation with GPT2	H Bai, P Shi, J Lin, L Tan, K Xiong, W Gao, J Liu, M Li	ACL 2021, 2020	9*	2020
Construction of Paired Knowledge Graph-Text Datasets Informed by Cyclic Evaluation	A Mousavi, X Zhan, H Bai, P Shi, T Rekatsinas, B Han, Y Li, J Pound, ...	COLING 2024, 2023	7	2023
Cross-lingual text-to-SQL semantic parsing with representation mixup	P Shi, L Song, L Jin, H Mi, RH Bai, J Lin, D Yu	Findings of the Association for Computational Linguistics: EMNLP 2022, 5296-5306, 2022	6	2022
Rephrasing the web: A recipe for compute and data-efficient language modeling	P Maini, S Seto, H Bai, D Grangier, Y Zhang, N Jaitly	2024. URL https://arxiv.org/abs/2401.16380	5	
Visatronic: A multimodal decoder-only model for speech synthesis	A Gupta, T Likhomanenko, KD Yang, RH Bai, Z Aldeneh, N Jaitly	arXiv preprint arXiv:2411.17690, 2024	3	2024
Training bilingual lms with data constraints in the targeted language	S Seto, M Ter Hoeve, RH Bai, N Schluter, D Grangier	Findings of the Association for Computational Linguistics: ACL 2025, 19096-19122, 2025	2	2025
