| Physics of language models: Part 2.1, grade-school math and the hidden reasoning process T Ye, Z Xu, Y Li, Z Allen-Zhu arXiv preprint arXiv:2407.20311, 2024 | 100 | 2024 |
| Global convergence of gradient descent for asymmetric low-rank matrix factorization T Ye, SS Du Advances in Neural Information Processing Systems 34, 1429-1439, 2021 | 73 | 2021 |
| Physics of language models: Part 2.2, how to learn from mistakes on grade-school math problems T Ye, Z Xu, Y Li, Z Allen-Zhu arXiv preprint arXiv:2408.16293, 2024 | 31 | 2024 |
| Physics of language models: Part 2.1, grade-school math and the hidden reasoning process, 2024 T Ye, Z Xu, Y Li, Z Allen-Zhu URL https://arxiv.org/abs/2407.20311, 0 | 16 | |
| Quantum complementarity approach to device-independent security X Zhang, P Zeng, T Ye, HK Lo, X Ma Physical Review Letters 131 (14), 140801, 2023 | 15 | 2023 |
| Physics of language models: Part 2.2, how to learn from mistakes on grade-school math problems, 2024b T Ye, Z Xu, Y Li, Z Allen-Zhu URL https://arxiv.org/abs/2408.16293, 0 | 8 | |
| Physics of Language Models: Part 2.1 T Ye, Z Xu, Y Li, Z Allen-Zhu How to Learn From Mistakes on Grade-School Math Problems. arXiv preprint …, 2024 | 7 | 2024 |
| DEED: A general quantization scheme for communication efficiency in bits T Ye, P Xiao, R Sun arXiv preprint arXiv:2006.11401, 2020 | 3 | 2020 |
| Learning to Reason: Intuition Formation in Language Models T Ye Carnegie Mellon University, 2025 | | 2025 |
| Global Convergence Rate of Gradient Flow for Asymmetric Matrix Factorization T Ye, SS Du | | |