| Newton-type methods for non-convex optimization under inexact Hessian information P Xu, F Roosta, MW Mahoney Mathematical Programming 184 (1), 35-70, 2020 | 263 | 2020 |
| Second-order optimization for non-convex machine learning: An empirical study P Xu, F Roosta, MW Mahoney Proceedings of the 2020 SIAM International Conference on Data Mining, 199-207, 2020 | 206 | 2020 |
| Improve transformer models with better relative position embeddings Z Huang, D Liang, P Xu, B Xiang arXiv preprint arXiv:2009.13658, 2020 | 187 | 2020 |
| GIANT: Globally improved approximate Newton method for distributed optimization S Wang, F Roosta, P Xu, MW Mahoney Advances in neural information processing systems 31, 2018 | 181 | 2018 |
| Sub-sampled Newton methods with non-uniform sampling P Xu, J Yang, F Roosta, C Ré, MW Mahoney Advances in neural information processing systems 29, 2016 | 157 | 2016 |
| Domain adaptation with BERT-based domain classification and data selection X Ma, P Xu, Z Wang, R Nallapati, B Xiang Proceedings of the 2nd Workshop on Deep Learning Approaches for Low-Resource …, 2019 | 154 | 2019 |
| Accelerated stochastic power iteration P Xu, B He, C De Sa, I Mitliagkas, C Ré International Conference on Artificial Intelligence and Statistics, 58-67, 2018 | 117 | 2018 |
| Trust region based adversarial attack on neural networks Z Yao, A Gholami, P Xu, K Keutzer, MW Mahoney Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2019 | 83 | 2019 |
| Inexact non-convex Newton-type methods Z Yao, P Xu, F Roosta-Khorasani, MW Mahoney arXiv preprint arXiv:1802.06925, 2018 | 81 | 2018 |
| Embedding-based zero-shot retrieval through query generation D Liang, P Xu, S Shakeri, CN Santos, R Nallapati, Z Huang, B Xiang arXiv preprint arXiv:2009.10270, 2020 | 58 | 2020 |
| TRANS-BLSTM: Transformer with bidirectional LSTM for language understanding Z Huang, P Xu, D Liang, A Mishra, B Xiang arXiv preprint arXiv:2003.07000, 2020 | 56 | 2020 |
| Entailment tree explanations via iterative retrieval-generation reasoner DN Ribeiro, S Wang, X Ma, R Dong, X Wei, H Zhu, X Chen, P Xu, ... Findings of the Association for Computational Linguistics: NAACL 2022, 465-475, 2022 | 49 | 2022 |
| Dual reader-parser on hybrid textual and tabular evidence for open domain question answering AH Li, P Ng, P Xu, H Zhu, Z Wang, B Xiang arXiv preprint arXiv:2108.02866, 2021 | 41 | 2021 |
| Newton-MR: Newton's method without smoothness or convexity F Roosta, Y Liu, P Xu, MW Mahoney arXiv preprint arXiv:1810.00303, 2018 | 38 | 2018 |
| Socratic learning: Augmenting generative models to incorporate latent subsets in training data P Varma, B He, D Iter, P Xu, R Yu, C De Sa, C Ré arXiv preprint arXiv:1610.08123, 2016 | 38* | 2016 |
| Passage Ranking with Weak Supervision P Xu, X Ma, R Nallapati, B Xiang arXiv preprint arXiv:1905.05910, 2019 | 22 | 2019 |
| Inexact Newton-CG algorithms with complexity guarantees Z Yao, P Xu, F Roosta, SJ Wright, MW Mahoney IMA Journal of Numerical Analysis 43 (3), 1855-1897, 2023 | 21 | 2023 |
| Attention-guided generative models for extractive question answering P Xu, D Liang, Z Huang, B Xiang arXiv preprint arXiv:2110.06393, 2021 | 21 | 2021 |
| Newton-MR: Inexact Newton method with minimum residual sub-problem solver F Roosta, Y Liu, P Xu, MW Mahoney EURO Journal on Computational Optimization 10, 100035, 2022 | 20 | 2022 |
| Contrastive document representation learning with graph attention networks P Xu, X Chen, X Ma, Z Huang, B Xiang arXiv preprint arXiv:2110.10778, 2021 | 11 | 2021 |