Stars
A collection of tricks to speed up LLMs
Identifying Mislabeled Instances in Classification Datasets
Understand and test language model architectures on synthetic tasks.
LLM Transparency Tool (LLM-TT): an open-source interactive toolkit for analyzing the internal workings of Transformer-based language models. Check out the demo at https://huggingface.co/spaces/facebook/l…
Repository for pre-built dev container images published under mcr.microsoft.com/devcontainers
MLCommons Algorithmic Efficiency is a benchmark and competition measuring neural network training speedups due to algorithmic improvements in both training algorithms and models.
Safe, portable, high performance compute (GPGPU) kernels.
Hardware accelerated, batchable and differentiable optimizers in JAX.
[ICLR 2024] Efficient Streaming Language Models with Attention Sinks
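The attention-sink idea behind this repo is to keep the first few tokens of the sequence in the KV cache alongside a sliding window of recent tokens, instead of a pure sliding window. A minimal sketch of that eviction policy (function and parameter names here are illustrative, not the repo's API):

```python
def evict(cache, n_sinks=4, window=8):
    """Keep the first n_sinks cache entries (the "attention sinks")
    plus the most recent `window` entries; drop everything in between."""
    if len(cache) <= n_sinks + window:
        return cache
    return cache[:n_sinks] + cache[-window:]

cache = list(range(20))          # stand-in for 20 cached KV entries
print(evict(cache))              # [0, 1, 2, 3, 12, 13, ..., 19]
```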
Fast, Modern, Memory Efficient, and Low Precision PyTorch Optimizers
Highly commented implementations of Transformers in PyTorch
The simplest, fastest repository for training/finetuning medium-sized GPTs.
The official implementation of “Sophia: A Scalable Stochastic Second-order Optimizer for Language Model Pre-training”
Enforce the output format (JSON Schema, Regex etc) of a language model
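Constrained decoding of this kind generally works by masking, at every step, the tokens that would take the output outside the allowed language. A toy sketch of that filtering step (this is not the repo's API, just the underlying idea, with a digits-only constraint as the example):

```python
import re

DIGITS = re.compile(r"\d*")

def allowed_tokens(prefix, vocab, pattern=DIGITS):
    """Return the vocabulary tokens that keep prefix+token a valid
    string of the constrained language (here: digits only)."""
    return [t for t in vocab if pattern.fullmatch(prefix + t)]

vocab = ["4", "2", "x", "!"]
print(allowed_tokens("17", vocab))   # ['4', '2']
```

A real engine applies this mask to the model's logits before sampling, so invalid continuations get probability zero.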
🔊 Text-Prompted Generative Audio Model
LMDeploy is a toolkit for compressing, deploying, and serving LLMs.
SkyPilot: Run AI and batch jobs on any infra (Kubernetes or 12+ clouds). Get unified execution, cost savings, and high GPU availability via a simple interface.
A high-throughput and memory-efficient inference and serving engine for LLMs
libsais is a library for linear-time suffix array, longest common prefix array, and Burrows-Wheeler transform construction, based on the induced sorting algorithm.
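For reference, a suffix array is just the lexicographically sorted list of suffix start positions. A naive O(n² log n) illustration (libsais achieves linear time via induced sorting; this sketch is only to show what is being computed):

```python
def naive_suffix_array(s):
    """Sort the suffix start positions of s lexicographically.
    Naive version: compares whole suffixes, so O(n^2 log n)."""
    return sorted(range(len(s)), key=lambda i: s[i:])

print(naive_suffix_array("banana"))  # [5, 3, 1, 0, 4, 2]
```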
Code for "Spectral Norm of Convolutional Layers with Circular and Zero Paddings" and "Efficient Bound of Lipschitz Constant for Convolutional Layers by Gram Iteration"
[UAI 2023] Official PyTorch implementation of LRFormer: Mitigating Transformer Overconfidence via Lipschitz Regularization
[CVPR 2023] A faster, smaller, and better text-to-image model for large-scale training
[ICCV 2023] Efficient Diffusion Training via Min-SNR Weighting Strategy