| Tülu 3: Pushing Frontiers in Open Language Model Post-Training N Lambert*, J Morrison*, V Pyatkin*, S Huang*, H Ivison*, F Brahman*, ... arXiv preprint arXiv:2411.15124, 2024 | 693* | 2024 |
| RewardBench: Evaluating reward models for language modeling N Lambert, V Pyatkin, J Morrison, LJ Miranda, BY Lin, K Chandu, N Dziri, ... Findings of the Association for Computational Linguistics: NAACL 2025, 1755–1797, 2025 | 492* | 2025 |
| PySwarms: a research toolkit for Particle Swarm Optimization in Python LJV Miranda The Journal of Open Source Software 3 (21), 2, 2018 | 465 | 2018 |
| 2 OLMo 2 Furious T OLMo, P Walsh, L Soldaini, D Groeneveld, K Lo, S Arora, A Bhagia, ... arXiv preprint arXiv:2501.00656, 2024 | 267* | 2024 |
| Consent in Crisis: The Rapid Decline of the AI Data Commons S Longpre, R Mahari, AN Lee, CS Lund, H Oderinwale, W Brannon, ... The Thirty-eighth Conference on Neural Information Processing Systems …, 2024 | 105 | 2024 |
| MMTEB: Massive multilingual text embedding benchmark K Enevoldsen, I Chung, I Kerboua, M Kardos, A Mathur, D Stap, J Gala, ... arXiv preprint arXiv:2502.13595, 2025 | 99* | 2025 |
| Universal NER: A Gold-Standard Multilingual Named Entity Recognition Benchmark S Mayhew, T Blevins, S Liu, M Šuppa, H Gonen, JM Imperial, BF Karlsson, ... Proceedings of the 2024 Conference of the North American Chapter of the …, 2023 | 40 | 2023 |
| M-RewardBench: Evaluating Reward Models in Multilingual Settings S Gureja*, LJV Miranda*, SB Islam*, R Maheshwary*, D Sharma, ... Proceedings of the 63rd Annual Meeting of the Association for Computational …, 2025 | 35* | 2025 |
| SEACrowd: A Multilingual Multimodal Data Hub and Benchmark Suite for Southeast Asian Languages H Lovenia, R Mahendra, SM Akbar, LJV Miranda, J Santoso, E Aco, ... Proceedings of the 2024 Conference on Empirical Methods in Natural Language …, 2024 | 24* | 2024 |
| Hybrid Preferences: Learning to Route Instances for Human vs. AI Feedback LJV Miranda*, Y Wang*, Y Elazar, S Kumar, V Pyatkin, F Brahman, ... Proceedings of the 63rd Annual Meeting of the Association for Computational …, 2025 | 23* | 2025 |
| Bridging the data provenance gap across text, speech and video S Longpre, N Singh, M Cherep, K Tiwary, J Materzynska, W Brannon, ... arXiv preprint arXiv:2412.17847, 2024 | 21* | 2024 |
| Multi hash embeddings in spaCy LJ Miranda*, Á Kádár*, A Boyd, S Van Landeghem, A Søgaard, ... arXiv preprint arXiv:2212.09255, 2022 | 13 | 2022 |
| R3: Robust rubric-agnostic reward models D Anugraha, Z Tang, LJV Miranda, H Zhao, MR Farhansyah, G Kuwanto, ... arXiv preprint arXiv:2505.13388, 2025 | 12* | 2025 |
| Crowdsource, Crawl, or Generate? Creating SEA-VL, a Multicultural Vision-Language Dataset for Southeast Asia S Cahyawijaya, H Lovenia, JRA Moniz, TH Wong, MR Farhansyah, ... Proceedings of the 63rd Annual Meeting of the Association for Computational …, 2025 | 10 | 2025 |
| Developing a Named Entity Recognition Dataset for Tagalog LJV Miranda Proceedings of the First Workshop in South East Asian Language Processing …, 2023 | 8 | 2023 |
| A deep learning approach based on stacked denoising autoencoders for protein function prediction LJ Miranda, J Hu 2018 IEEE 42nd Annual Computer Software and Applications Conference (COMPSAC …, 2018 | 8 | 2018 |
| Appliance recognition using Hall Effect sensors and k-nearest neighbors for power management systems LJV Miranda, MJS Gutierrez, SMG Dumlao, RSJ Reyes 2016 IEEE Region 10 Conference (TENCON), 6-9, 2016 | 4 | 2016 |
| calamanCy: A Tagalog Natural Language Processing Toolkit LJV Miranda Proceedings of the 3rd Workshop for Natural Language Processing Open Source …, 2023 | 2 | 2023 |
| Olmo 3 T Olmo, A Ettinger, A Bertsch, B Kuehl, D Graham, D Heineman, ... arXiv preprint arXiv:2512.13961, 2025 | 1 | 2025 |
| FilBench: Can LLMs Understand and Generate Filipino? LJV Miranda*, E Aco*, C Manuel*, JCB Cruz, JM Imperial Proceedings of the 2025 Conference on Empirical Methods in Natural Language …, 2025 | 1 | 2025 |