Edward Beeching
Research Scientist, Hugging Face
Verified email at insa-lyon.fr - Homepage
Title
Cited by
Year
Zephyr: Direct distillation of LM alignment
L Tunstall, E Beeching, N Lambert, N Rajani, K Rasul, Y Belkada, ...
arXiv preprint arXiv:2310.16944, 2023
Cited by: 831 | Year: 2023
TRL: Transformer reinforcement learning
L von Werra, Y Belkada, L Tunstall, E Beeching, T Thrush, N Lambert, ...
Cited by: 594 | Year: 2020
Open LLM leaderboard
E Beeching, C Fourrier, N Habib, S Han, N Lambert, N Rajani, ...
Cited by: 421 | Year: 2023
NuminaMath: The largest public dataset in AI4Maths with 860k pairs of competition math problems and solutions
J Li, E Beeching, L Tunstall, B Lipkin, R Soletskyi, S Huang, K Rasul, L Yu, ...
Hugging Face repository 13 (9), 9, 2024
Cited by: 202 | Year: 2024
NuminaMath
LI Jia, E Beeching, L Tunstall, B Lipkin, R Soletskyi, SC Huang, K Rasul, ...
Cited by: 121 | Year: 2024
Optimizing test-time compute via meta reinforcement fine-tuning
Y Qu, MYR Yang, A Setlur, L Tunstall, EE Beeching, R Salakhutdinov, ...
arXiv preprint arXiv:2503.07572, 2025
Cited by: 84 | Year: 2025
The alignment handbook
L Tunstall, E Beeching, N Lambert, N Rajani, S Huang, K Rasul, AM Rush, ...
URL https://github.com/huggingface/alignment-handbook 6, 2023
Cited by: 71 | Year: 2023
Learning to plan with uncertain topological maps
E Beeching, J Dibangoye, O Simonin, C Wolf
European Conference on Computer Vision, 473-490, 2020
Cited by: 62 | Year: 2020
NuminaMath
J Li, E Beeching, L Tunstall, B Lipkin, R Soletskyi, SC Huang, K Rasul, ...
available at GitHub Repository Project Numina: https://github.com …, 2024
Cited by: 61 | Year: 2024
No robots
N Rajani, L Tunstall, E Beeching, N Lambert, AM Rush, T Wolf
Hugging Face repository, 2023
Cited by: 50 | Year: 2023
Zephyr: Direct distillation of LM alignment, 2023
L Tunstall, E Beeching, N Lambert, N Rajani, K Rasul, Y Belkada, ...
URL https://arxiv.org/abs/2310.16944 6, 2023
Cited by: 40 | Year: 2023
Scaling test-time compute with open models
E Beeching, L Tunstall, S Rush
URL https://huggingface.co/spaces/HuggingFaceH4/blogpost-scaling-test-time …, 2024
Cited by: 33 | Year: 2024
Deep reinforcement learning on a budget: 3D control and reasoning without a supercomputer
E Beeching, J Dibangoye, O Simonin, C Wolf
2020 25th International Conference on Pattern Recognition (ICPR), 158-165, 2021
Cited by: 33 | Year: 2021
EgoMap: Projective mapping and structured egocentric memory for deep RL
E Beeching, J Dibangoye, O Simonin, C Wolf
Joint European conference on machine learning and knowledge discovery in …, 2020
Cited by: 31 | Year: 2020
Creating a coding assistant with StarCoder. Hugging Face Blog (2023)
L Tunstall, N Lambert, N Rajani, E Beeching, T Le Scao, L von Werra, ...
Cited by: 27 | Year: 2023
Creating a coding assistant with StarCoder
L Tunstall, N Lambert, N Rajani, E Beeching, T Le Scao, L von Werra, ...
Hugging Face Blog, 2023
Cited by: 25 | Year: 2023
Godot reinforcement learning agents
E Beeching, J Dibangoye, O Simonin, C Wolf
arXiv preprint arXiv:2112.03636, 2021
Cited by: 25 | Year: 2021
StackLLaMA: An RL Finetuned LLaMA Model for Stack Exchange Question and Answering
E Beeching, Y Belkada, K Rasul, L Tunstall, L von Werra, N Rajani, ...
See https://huggingface.co/blog/stackllama (accessed 14 April 2023), 2023
Cited by: 22 | Year: 2023
TRL: transformer reinforcement learning (2020)
L von Werra, Y Belkada, L Tunstall, E Beeching, T Thrush, N Lambert, ...
URL https://github.com/huggingface/trl, 0
Cited by: 20
Jack of all trades, master of some, a multi-purpose transformer agent
Q Gallouédec, E Beeching, C Romac, E Dellandréa
arXiv preprint arXiv:2402.09844, 2024
Cited by: 19 | Year: 2024