compressionOrg

Popular repositories

  1. D-Pruner

    [NAACL Findings 2024] Pruning as a Domain-specific LLM Extractor. Supports LLaMA2.

    Python

  2. sparsegpt

    Forked from IST-DASLab/sparsegpt

    Code for the ICML 2023 paper "SparseGPT: Massive Language Models Can Be Accurately Pruned in One-Shot".

    Python

  3. wanda

    Forked from locuslab/wanda

    A simple and effective LLM pruning approach.

    Python

  4. FLAP

    Forked from CASIA-IVA-Lab/FLAP

    [AAAI 2024] Fluctuation-based Adaptive Structured Pruning for Large Language Models

    Python

  5. TransformerCompression

    Forked from microsoft/TransformerCompression

    For releasing code related to compression methods for transformers, accompanying our publications

    Python

  6. LLMPrune-BESA

    Forked from OpenGVLab/LLMPrune-BESA

    BESA is a differentiable weight pruning technique for large language models.

    Python
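
Most of the popular repositories above are pruning methods. As a concrete illustration of the simplest of them, Wanda scores each weight by its magnitude times the norm of the corresponding input activation channel, then zeroes the lowest-scoring weights within each output row. A minimal NumPy sketch of that metric follows; the function name and array shapes are assumptions for illustration, not the repo's actual API:

```python
import numpy as np

def wanda_prune(W, X, sparsity=0.5):
    """Prune W (out_features x in_features) with the Wanda metric
    score_ij = |W_ij| * ||X_j||_2, where X (n_samples x in_features)
    holds calibration activations. The lowest-scoring weights are
    zeroed per output row, matching the paper's per-output comparison."""
    act_norm = np.linalg.norm(X, axis=0)        # (in_features,) channel norms
    scores = np.abs(W) * act_norm               # (out, in) importance scores
    k = int(W.shape[1] * sparsity)              # weights to drop per row
    prune_idx = np.argsort(scores, axis=1)[:, :k]  # k smallest scores per row
    W_pruned = W.copy()
    np.put_along_axis(W_pruned, prune_idx, 0.0, axis=1)
    return W_pruned
```

Because the metric needs only a small calibration batch and no gradients or weight updates, this runs in one shot per layer, which is what makes the approach "simple and effective" relative to retraining-based pruning.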

Repositories

Showing 10 of 36 repositories
  • Awesome-Efficient-LLM (forked from horseee/Awesome-Efficient-LLM)

    A curated list for Efficient Large Language Models

    Python · Updated Oct 30, 2024
  • OmniQuant (forked from OpenGVLab/OmniQuant)

    [ICLR 2024 spotlight] OmniQuant is a simple and powerful quantization technique for LLMs.

    Python · MIT license · Updated Oct 16, 2024
  • QLLM (forked from ModelTC/QLLM)

    [ICLR 2024] This is the official PyTorch implementation of "QLLM: Accurate and Efficient Low-Bitwidth Quantization for Large Language Models"

    Python · Apache-2.0 license · Updated Oct 15, 2024
  • RPTQ4LLM (forked from hahnyuan/RPTQ4LLM)

    Reorder-based post-training quantization for large language models

    Python · MIT license · Updated Oct 14, 2024
  • distillm (forked from jongwooko/distillm)

    Official PyTorch implementation of DistiLLM: Towards Streamlined Distillation for Large Language Models (ICML 2024)

    Python · Updated Sep 20, 2024
  • LLM-QAT (forked from facebookresearch/LLM-QAT)

    Code repo for the paper "LLM-QAT: Data-Free Quantization Aware Training for Large Language Models"

    Python · Updated Sep 3, 2024
  • OWL (forked from luuyin/OWL)

    Official PyTorch implementation of "Outlier Weighed Layerwise Sparsity (OWL): A Missing Secret Sauce for Pruning LLMs to High Sparsity"

    Jupyter Notebook · MIT license · Updated Aug 20, 2024
  • llm-awq (forked from mit-han-lab/llm-awq)

    [MLSys 2024 Best Paper Award] AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration

    Python · MIT license · Updated Jul 16, 2024
  • GPTQ-for-LLaMa (forked from qwopqwop200/GPTQ-for-LLaMa)

    4-bit quantization of LLaMA using GPTQ

    Python · Apache-2.0 license · Updated Jul 13, 2024
  • smoothquant (forked from mit-han-lab/smoothquant)

    [ICML 2023] SmoothQuant: Accurate and Efficient Post-Training Quantization for Large Language Models

    Python · MIT license · Updated Jul 12, 2024
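
Many of the forks above are quantization methods, and SmoothQuant's core idea fits in a few lines: activation outliers make activations hard to quantize, so it rescales each input channel by a factor s_j = max|X_j|^α / max|W_j|^(1−α) (α is typically 0.5), computing (X / s) @ (W * s).T, which is mathematically identical to X @ W.T but migrates outlier magnitude from activations into weights. A minimal NumPy sketch, with the function name and shapes assumed for illustration rather than taken from the repo:

```python
import numpy as np

def smooth_scales(X, W, alpha=0.5):
    """SmoothQuant-style per-input-channel smoothing factors:
    s_j = max|X_j|**alpha / max|W_j|**(1 - alpha).
    X: (n_tokens, d_in) calibration activations; W: (d_out, d_in) weights.
    The layer then uses X / s and W * s, an exact reparameterization
    that shifts quantization difficulty from activations to weights."""
    x_max = np.abs(X).max(axis=0)   # per-channel activation peak
    w_max = np.abs(W).max(axis=0)   # per-channel weight peak
    return (x_max ** alpha) / (w_max ** (1.0 - alpha))
```

In practice both the scaled activations and scaled weights are then quantized to int8; the reparameterization is what makes that post-training step accurate without retraining.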

People

This organization has no public members.
