Stars
[NeurIPS 2024] SWE-agent takes a GitHub issue and tries to automatically fix it, using GPT-4, or your LM of choice. It can also be employed for offensive cybersecurity or competitive coding challen…
A curriculum for learning about foundation models, from scratch to the frontier
[ICLR 2024] SWE-bench: Can Language Models Resolve Real-world GitHub Issues?
My writings about ARC (Abstraction and Reasoning Corpus)
The fastest way to make and track predictions
A map of the AI alignment landscape
Python implementation of features from the Squiggle programming language for intuitive probabilistic estimation
Experimental models by the QURI team & others
Manifold Markets: A market for every question
TextAttack 🐙 is a Python framework for adversarial attacks, data augmentation, and model training in NLP. Documentation: https://textattack.readthedocs.io/en/master/
Official Implementation for "StyleCLIP: Text-Driven Manipulation of StyleGAN Imagery" (ICCV 2021 Oral)
Interpretable ML package 🔍 for concise, transparent, and accurate predictive modeling (sklearn-compatible).
Model parallel transformers in JAX and Haiku
TruthfulQA: Measuring How Models Imitate Human Falsehoods
Evaluation suite for large-scale language models.
Repository for Zheng and Guha et al., 2021, "When Does Pretraining Help? Assessing Self-Supervised Learning for Law and the CaseHOLD Dataset of 53,000+ Legal Holdings"
Measuring Massive Multitask Language Understanding | ICLR 2021
Aligning AI With Shared Human Values (ICLR 2021)
Fetch forecasts from prediction markets/forecasting platforms to make them searchable. Integrate these forecasts into other services.
Unified Multilingual Robustness Evaluation Toolkit for Natural Language Processing
[EMNLP 2021] Improving and Simplifying Pattern Exploiting Training