| MS MARCO: A human generated machine reading comprehension dataset DF Campos, T Nguyen, M Rosenberg, X Song, J Gao, S Tiwary, ... ArXiv, abs/1611.09268 29, 2016 | 3245* | 2016 |
| Overview of the TREC 2022 deep learning track N Craswell, B Mitra, E Yilmaz, D Campos, J Lin, EM Voorhees, I Soboroff arXiv preprint arXiv:2507.10865, 2025 | 877 | 2025 |
| XGLUE: A new benchmark dataset for cross-lingual pre-training, understanding and generation Y Liang, N Duan, Y Gong, N Wu, F Guo, W Qi, M Gong, L Shou, D Jiang, ... arXiv preprint arXiv:2004.01401, 2020 | 400 | 2020 |
| The optimal BERT surgeon: Scalable and accurate second-order pruning for large language models E Kurtic, D Campos, T Nguyen, E Frantar, M Kurtz, B Fineran, M Goin, ... arXiv preprint arXiv:2203.07259, 2022 | 188 | 2022 |
| MS MARCO: A human generated machine reading comprehension dataset DF Campos, T Nguyen, M Rosenberg, X Song, J Gao, S Tiwary, ... ArXiv, abs/1611.09268 29, 2016 | 126 | 2016 |
| Leading conversational search by suggesting useful questions C Rosset, C Xiong, X Song, D Campos, N Craswell, S Tiwary, P Bennett Proceedings of the web conference 2020, 1160-1170, 2020 | 116 | 2020 |
| ORCAS: 18 million clicked query-document pairs for analyzing search N Craswell, D Campos, B Mitra, E Yilmaz, B Billerbeck Proceedings of the 29th ACM International Conference on Information …, 2020 | 109 | 2020 |
| MS MARCO: Benchmarking ranking models in the large-data regime N Craswell, B Mitra, E Yilmaz, D Campos, J Lin Proceedings of the 44th international ACM SIGIR conference on research and …, 2021 | 95 | 2021 |
| TREC deep learning track: Reusable test collections in the large data regime N Craswell, B Mitra, E Yilmaz, D Campos, EM Voorhees, I Soboroff Proceedings of the 44th international ACM SIGIR conference on research and …, 2021 | 80 | 2021 |
| Open domain web keyphrase extraction beyond language modeling L Xiong, C Hu, C Xiong, D Campos, A Overwijk arXiv preprint arXiv:1911.02671, 2019 | 80 | 2019 |
| Arctic-embed: Scalable, efficient, and accurate text embedding models L Merrick, D Xu, G Nuti, D Campos arXiv preprint arXiv:2405.05374, 2024 | 75 | 2024 |
| MS MARCO: A human generated machine reading comprehension dataset, 2018 P Bajaj, D Campos, N Craswell, L Deng, J Gao, X Liu, R Majumder, ... URL https://arxiv.org/abs/1611.09268, 2018 | 57 | 2018 |
| Overview of the TREC 2019 deep learning track N Craswell, B Mitra, E Yilmaz, D Campos, EM Voorhees CoRR abs/2003.07820, 2020 | 56 | 2020 |
| Arctic-embed 2.0: Multilingual retrieval without compromise P Yu, L Merrick, G Nuti, D Campos arXiv preprint arXiv:2412.04506, 2024 | 53 | 2024 |
| Curriculum learning for language modeling D Campos arXiv preprint arXiv:2108.02170, 2021 | 40 | 2021 |
| Ragnarök: A reusable RAG framework and baselines for TREC 2024 retrieval-augmented generation track R Pradeep, N Thakur, S Sharifymoghaddam, E Zhang, R Nguyen, ... European Conference on Information Retrieval, 132-148, 2025 | 38 | 2025 |
| Synthetic test collections for retrieval evaluation HA Rahmani, N Craswell, E Yilmaz, B Mitra, D Campos Proceedings of the 47th International ACM SIGIR Conference on Research and …, 2024 | 38 | 2024 |
| A large-scale study of relevance assessments with large language models: An initial look S Upadhyay, R Pradeep, N Thakur, D Campos, N Craswell, I Soboroff, ... arXiv preprint arXiv:2411.08275, 2024 | 37 | 2024 |
| Keyphrase extraction beyond language modeling L Xiong, C Hu, A Overwijk, J Ahmed, DF Campos, C Xiong US Patent 11,250,214, 2022 | 37 | 2022 |
| Reading COmprehension Dataset P Bajaj, D Campos, N Craswell, L Deng, J Gao, X Liu, R Majumder, ... | 36* | 2016 |