| Publication | Cited by | Year |
| --- | --- | --- |
| Offline reinforcement learning with realizability and single-policy concentrability. W Zhan, B Huang, A Huang, N Jiang, J Lee. Conference on Learning Theory, 2730-2775, 2022 | 169 | 2022 |
| Structural conservation of chemotaxis machinery across Archaea and Bacteria. A Briegel, DR Ortega, AN Huang, CM Oikonomou, RP Gunsalus, ... Environmental Microbiology Reports 7 (3), 414-419, 2015 | 94 | 2015 |
| Asymmetric enzymatic synthesis of allylic amines: a sigmatropic rearrangement strategy. CK Prier, TK Hyster, CC Farwell, A Huang, FH Arnold. Angewandte Chemie International Edition 55 (15), 4711-4715, 2016 | 91 | 2016 |
| Graph-structured visual imitation. M Sieb, Z Xian, A Huang, O Kroemer, K Fragkiadaki. Conference on Robot Learning, 979-989, 2020 | 90 | 2020 |
| Off-policy risk assessment in contextual bandits. A Huang, L Leqi, Z Lipton, K Azizzadenesheli. Advances in Neural Information Processing Systems 34, 23714-23726, 2021 | 50 | 2021 |
| Self-improvement in language models: The sharpening mechanism. A Huang, A Block, DJ Foster, D Rohatgi, C Zhang, M Simchowitz, JT Ash, ... arXiv preprint arXiv:2412.01951, 2024 | 47 | 2024 |
| Correcting the mythos of KL-regularization: Direct alignment without overoptimization via chi-squared preference optimization. A Huang, W Zhan, T Xie, JD Lee, W Sun, A Krishnamurthy, DJ Foster. arXiv preprint arXiv:2407.13399, 2024 | 43 | 2024 |
| Morphology of the archaellar motor and associated cytoplasmic cone in Thermococcus kodakaraensis. A Briegel, CM Oikonomou, YW Chang, A Kjær, AN Huang, KW Kim, ... EMBO reports 18 (9), 1660-1670, 2017 | 38 | 2017 |
| Is best-of-N the best of them? Coverage, scaling, and optimality in inference-time alignment. A Huang, A Block, Q Liu, N Jiang, A Krishnamurthy, DJ Foster. arXiv preprint arXiv:2503.21878, 2025 | 33 | 2025 |
| On the convergence and optimality of policy gradient for Markov coherent risk. A Huang, L Leqi, ZC Lipton, K Azizzadenesheli. arXiv preprint arXiv:2103.02827, 2021 | 28 | 2021 |
| Reinforcement learning in low-rank MDPs with density features. A Huang, J Chen, N Jiang. International Conference on Machine Learning, 13710-13752, 2023 | 24 | 2023 |
| Computational-statistical tradeoffs at the next-token prediction barrier: Autoregressive and imitation learning under misspecification. D Rohatgi, A Block, A Huang, A Krishnamurthy, DJ Foster. arXiv preprint arXiv:2502.12465, 2025 | 14 | 2025 |
| Supervised learning with general risk functionals. L Leqi, A Huang, Z Lipton, K Azizzadenesheli. International Conference on Machine Learning, 12570-12592, 2022 | 14 | 2022 |
| Beyond the return: Off-policy function estimation under user-specified error-measuring distributions. A Huang, N Jiang. Advances in Neural Information Processing Systems 35, 6292-6303, 2022 | 12 | 2022 |
| Off-policy risk assessment for Markov decision processes. A Huang, L Leqi, Z Lipton, K Azizzadenesheli. International Conference on Artificial Intelligence and Statistics, 5022-5050, 2022 | 12 | 2022 |
| Non-adaptive online finetuning for offline reinforcement learning. A Huang, M Ghavamzadeh, N Jiang, M Petrik. Reinforcement Learning Conference, 2024 | 5 | 2024 |
| Timing as an action: Learning when to observe and act. H Zhou, A Huang, K Azizzadenesheli, D Childers, Z Lipton. International Conference on Artificial Intelligence and Statistics, 3979-3987, 2024 | 4 | 2024 |
| Model selection for off-policy evaluation: New algorithms and experimental protocol. P Liu, L Zhao, S Agarwal, J Liu, A Huang, P Amortila, N Jiang. arXiv preprint arXiv:2502.08021, 2025 | 3 | 2025 |
| Occupancy-based policy gradient: Estimation, convergence, and optimality. A Huang, N Jiang. Advances in Neural Information Processing Systems 37, 416-468, 2024 | 3 | 2024 |