SGLang is a fast serving framework for large language models and vision language models.

Python 6,080 509 Updated Nov 17, 2024

LMDeploy is a toolkit for compressing, deploying, and serving LLMs.

Python 4,653 427 Updated Nov 16, 2024

How to optimize some algorithms in CUDA.

Cuda 1,591 131 Updated Nov 12, 2024

Yinghan's Code Sample

Cuda 288 54 Updated Jul 25, 2022

MegEngine is a fast, scalable, easy-to-use deep learning framework with support for automatic differentiation.

C++ 4,766 543 Updated Oct 24, 2024

A throughput-oriented high-performance serving framework for LLMs

Cuda 636 26 Updated Sep 21, 2024

📖 A curated list of awesome LLM inference papers with code: TensorRT-LLM, vLLM, streaming-llm, AWQ, SmoothQuant, WINT8/4, Continuous Batching, FlashAttention, PagedAttention, etc.

2,833 193 Updated Nov 16, 2024

LightLLM is a Python-based LLM (Large Language Model) inference and serving framework, notable for its lightweight design, easy scalability, and high-speed performance.

Python 2,609 205 Updated Nov 16, 2024

CUDA Templates for Linear Algebra Subroutines

C++ 5,667 973 Updated Nov 8, 2024

hardware & software prefetcher

20 4 Updated Dec 21, 2023

LLM inference in C/C++

C++ 67,957 9,746 Updated Nov 17, 2024

Solve puzzles. Learn CUDA.

Jupyter Notebook 9,912 857 Updated Sep 1, 2024

Unified Efficient Fine-Tuning of 100+ LLMs (ACL 2024)

Python 34,430 4,245 Updated Nov 16, 2024

Chinese CSL (Citation Style Language) styles

XML 5,131 832 Updated Nov 17, 2024

🤯 Lobe Chat - an open-source, modern-design AI chat framework. Supports multiple AI providers (OpenAI / Claude 3 / Gemini / Ollama / Azure / DeepSeek), Knowledge Base (file upload / knowledge manageme…

TypeScript 44,630 10,016 Updated Nov 17, 2024

User-friendly AI Interface (Supports Ollama, OpenAI API, ...)

Svelte 47,143 5,756 Updated Nov 17, 2024

A speculative mechanism to accelerate long-latency off-chip load requests by removing on-chip cache access latency from their critical path, as described in the MICRO 2022 paper by Bera et al. (https:/…

C++ 68 12 Updated Sep 8, 2024

A Study of the SiFive Inclusive L2 Cache

C 44 11 Updated Dec 27, 2023

ChampSim is an open-source trace based simulator maintained at Texas A&M University and through the support of the computer architecture community.

C++ 520 432 Updated Nov 15, 2024

A CPU tool for benchmarking peak floating-point performance

Assembly 502 123 Updated Oct 4, 2024

A customizable hardware prefetching framework using online reinforcement learning as described in the MICRO 2021 paper by Bera et al. (https://arxiv.org/pdf/2109.12021.pdf).

C++ 117 37 Updated May 22, 2024

Digital Design with Chisel

TeX 771 144 Updated Nov 7, 2024

Touying is a powerful package for creating presentation slides in Typst.

Typst 792 18 Updated Nov 14, 2024

A new markup-based typesetting system that is powerful and easy to learn.

Rust 35,160 939 Updated Nov 17, 2024

OpenNNA2.0, an open-source neural network inference framework written in C (C99)

C 65 6 Updated Aug 3, 2023

Termux - a terminal emulator application for Android OS, extensible by a variety of packages.

Java 36,507 3,835 Updated Oct 28, 2024

Datasets, Transforms and Models specific to Computer Vision

Python 16,260 6,956 Updated Nov 17, 2024