compressionOrg
Popular repositories

- sparsegpt (Python, forked from IST-DASLab/sparsegpt)
  Code for the ICML 2023 paper "SparseGPT: Massive Language Models Can Be Accurately Pruned in One-Shot".
- FLAP (Python, forked from CASIA-IVA-Lab/FLAP)
  [AAAI 2024] Fluctuation-based Adaptive Structured Pruning for Large Language Models.
- TransformerCompression (Python, forked from microsoft/TransformerCompression)
  Code for compression methods for transformers, accompanying the upstream authors' publications.
- LLMPrune-BESA (Python, forked from OpenGVLab/LLMPrune-BESA)
  BESA is a differentiable weight pruning technique for large language models.
Repositories
- Awesome-Efficient-LLM (forked from horseee/Awesome-Efficient-LLM)
  A curated list for Efficient Large Language Models.
- OmniQuant (forked from OpenGVLab/OmniQuant)
  [ICLR 2024 spotlight] OmniQuant is a simple and powerful quantization technique for LLMs.
- QLLM (forked from ModelTC/QLLM)
  [ICLR 2024] Official PyTorch implementation of "QLLM: Accurate and Efficient Low-Bitwidth Quantization for Large Language Models".
- RPTQ4LLM (forked from hahnyuan/RPTQ4LLM)
  Reorder-based post-training quantization for large language models.
- distillm (forked from jongwooko/distillm)
  Official PyTorch implementation of "DistiLLM: Towards Streamlined Distillation for Large Language Models" (ICML 2024).
- LLM-QAT (forked from facebookresearch/LLM-QAT)
  Code for the paper "LLM-QAT: Data-Free Quantization Aware Training for Large Language Models".
- OWL (forked from luuyin/OWL)
  Official PyTorch implementation of "Outlier Weighed Layerwise Sparsity (OWL): A Missing Secret Sauce for Pruning LLMs to High Sparsity".
- llm-awq (forked from mit-han-lab/llm-awq)
  [MLSys 2024 Best Paper Award] AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration.
- GPTQ-for-LLaMa (forked from qwopqwop200/GPTQ-for-LLaMa)
  4-bit quantization of LLaMA using GPTQ.
- smoothquant (forked from mit-han-lab/smoothquant)
  [ICML 2023] SmoothQuant: Accurate and Efficient Post-Training Quantization for Large Language Models.