Benjamin Clavié

Cited by

	All	Since 2021
Citations	952	921
h-index	9	9
i10-index	9	9

Citations per year

Year	Citations
2018	4
2019	9
2020	13
2021	20
2022	32
2023	45
2024	108
2025	693
2026	21

Co-authors

Benjamin Warner, Sophont (verified email at sophontai.com)
Antoine Chaffin, LightOn (verified email at lighton.ai)
Jeremy Howard, Honorary Professor, University of Queensland (verified email at zz.lc)
Kobi Gal, Ben-Gurion University, University of Edinburgh (verified email at bgu.ac.il)
Aamir Shakir, Mixedbread (verified email at mixedbread.ai)
Rikiya TAKEHI, Waseda University (verified email at fuji.waseda.jp)
Makoto P. Kato, University of Tsukuba (verified email at acm.org)
Xianming Li, PhD candidate@PolyU, Baking@Mixedbread, Ex Algorithm Engineer@Alipay (verified email at connect.polyu.hk)

Benjamin Clavié

Mixedbread

Verified email at mixedbread.com

Information Retrieval, Natural Language Processing, Machine Learning


Title	Authors	Venue	Cited by	Year
Smarter, better, faster, longer: A modern bidirectional encoder for fast, memory efficient, and long context finetuning and inference	B Warner, A Chaffin, B Clavié, O Weller, O Hallström, S Taghadouini, ...	arXiv preprint arXiv:2412.13663, 2024	545	2024
Large Language Models in the Workplace: A Case Study on Prompt Engineering for Job Type Classification	B Clavié, A Ciceu, F Naylor, G Soulié, T Brightwell	28th International Conference on Natural Language & Information Systems …, 2023	130	2023
Employing machine learning techniques for detection and classification of phishing emails	N Moradpoor, B Clavie, B Buchanan	2017 Computing Conference, 149-156, 2017	93	2017
EduBERT: Pretrained deep language models for learning analytics	B Clavié, K Gal	Companion Proceedings of the Tenth International Conference on Learning …, 2020	47	2020
Large Language Models as Batteries-Included Zero-Shot ESCO Skills Matchers	B Clavié, G Soulié	RecSys in HR’23: The 3rd Workshop on Recommender Systems for Human Resources …, 2023	44	2023
The unreasonable effectiveness of the baseline: discussing SVMs in legal text classification	B Clavié, M Alphonsus	Legal Knowledge and Information Systems: JURIX 2021, 2021	21	2021
rerankers: A lightweight Python library to unify ranking methods	B Clavié	arXiv preprint arXiv:2408.17344, 2024	13*	2024
JaColBERTv2.5: Optimising Multi-Vector Retrievers to Create State-of-the-Art Japanese Retrievers with Constrained Resources	B Clavié	Journal of Natural Language Processing 32 (1), 176-218, 2025	12	2025
JaColBERT and hard negatives, towards better Japanese-first embeddings for retrieval: Early technical report	B Clavié	arXiv preprint arXiv:2312.16144, 2023	11*	2023
Reducing the footprint of multi-vector retrieval with minimal performance impact via token pooling	B Clavié, A Chaffin, G Adams	arXiv preprint arXiv:2409.14683, 2024	8	2024
LegaLMFiT: efficient short legal text classification with LSTM language model pre-training	B Clavié, A Gheewala, P Briton, M Alphonsus, R Laabiyad, F Piccoli	arXiv preprint arXiv:2109.00993, 2021	8	2021
Smarter, Better, Faster, Longer: A Modern Bidirectional Encoder for Fast, Memory Efficient, and Long Context Finetuning and Inference. CoRR abs/2412.13663 (2024)	B Warner, A Chaffin, B Clavié, O Weller, O Hallström, S Taghadouini, ...	arXiv preprint ARXIV.2412.13663, 2024	7	2024
Deep Embeddings of Contextual Assessment Data for Improving Performance Prediction	B Clavié, K Gal	International Educational Data Mining Society, 2020	5	2020
Towards simple hybrid language model reasoning through human explanations enhanced prompts	B Clavié, G Soulié, F Naylor, T Brightwell	HHAI 2023: Augmenting human intellect, 379-381, 2023	4	2023
Fantastic (small) retrievers and how to train them: mxbai-edge-colbert-v0 tech report	R Takehi, B Clavié, S Lee, A Shakir	arXiv preprint arXiv:2510.14880, 2025	2	2025
It’s all in the [MASK]: Simple instruction-tuning enables BERT-like masked language models as generative classifiers	B Clavié, N Cooper, B Warner	Natural Language Processing Journal, 100150, 2025	2	2025
LIR: The First Workshop on Late Interaction and Multi Vector Retrieval @ ECIR 2026	B Clavié, X Li, A Chaffin, O Khattab, T Aarsen, M Faysse, J Li	arXiv preprint arXiv:2511.00444, 2025		2025
Simple Projection Variants Improve ColBERT Performance	B Clavié, S Lee, R Takehi, A Shakir, MP Kato	arXiv preprint arXiv:2510.12327, 2025		2025
ReadBench: Measuring the Dense Text Visual Reading Ability of Vision-Language Models	B Clavié, F Brand	arXiv preprint arXiv:2505.19091, 2025		2025

Articles 1–19
