Lianmin Zheng
Coauthor of SGLang, vLLM
Authored Projects (1)
Starred by Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems"), Wei-Lin Chiang (Cofounder of LMArena), Ying Sheng (Coauthor of SGLang), Luis Capelo (Cofounder of Lightning AI), and 10 more.
awesome-tensor-compilers
by
merrymercy
0.1%
3k
Curated list of tensor compiler projects and papers
Optimizes deep learning computations across diverse hardware via advanced compilation techniques.
Features projects on intermediate representations, auto-tuning, cost models, and polyhedral optimization.
Covers dynamic shapes, quantization, sparsity, and distributed computing for ML workloads.
Showcases key frameworks like TVM, MLIR, and Triton for research and development.
Created 5 years ago
Updated 1 year ago
Starred Projects (93)
DeepSeek-V3.2-Exp
by
deepseek-ai
3.2%
961
Experimental LLM boosting long-context efficiency
Created 1 month ago
Updated 1 month ago
GLM-4.5
by
zai-org
1.3%
3k
Foundation models for intelligent agents
Created 3 months ago
Updated 3 weeks ago
Kimi-K2
by
MoonshotAI
0.5%
8k
State-of-the-art MoE language model
Starred by
+4
Created 4 months ago
Updated 4 days ago
ome
by
sgl-project
1.0%
303
Kubernetes operator for LLM serving
Created 5 months ago
Updated 1 day ago
slime
by
THUDM
3.3%
2k
LLM post-training framework for RL scaling
Starred by
+3
Created 4 months ago
Updated 1 day ago
RL2
by
ChenmienTan
0.4%
908
Reinforcement learning for large language models
Starred by
+1
Created 7 months ago
Updated 10 hours ago
Mooncake
by
kvcache-ai
0.9%
4k
Research paper on a disaggregated architecture for LLM serving
Starred by
+2
Created 1 year ago
Updated 11 hours ago
AReaL
by
inclusionAI
1.1%
3k
Distributed RL system for LLM reasoning
Created 8 months ago
Updated 9 hours ago
LLaMA-Factory
by
hiyouga
1.3%
62k
Unified fine-tuning tool for 100+ LLMs & VLMs (ACL 2024)
Starred by
+25
Created 2 years ago
Updated 9 hours ago
verl
by
volcengine
1.6%
15k
RL training library for LLMs
Starred by
+14
Created 1 year ago
Updated 9 hours ago
DeepSeek-V3
by
deepseek-ai
0.1%
100k
MoE language model research paper with 671B total parameters
Starred by
+13
Created 10 months ago
Updated 2 months ago
gemlite
by
dropbox
1.3%
390
Triton kernels for efficient low-bit matrix multiplication
Created 1 year ago
Updated 1 week ago
HunyuanVideo
by
Tencent-Hunyuan
0.3%
11k
PyTorch code for video generation research
Created 11 months ago
Updated 2 months ago
MiniCPM
by
OpenBMB
0.1%
8k
Ultra-efficient LLMs for end devices, achieving 5x+ speedup
Created 1 year ago
Updated 3 weeks ago
nunchaku
by
nunchaku-tech
0.8%
3k
High-performance 4-bit diffusion model inference engine
Created 1 year ago
Updated 1 week ago
deepcompressor
by
nunchaku-tech
0.6%
688
Model compression toolbox for LLMs and diffusion models
Created 1 year ago
Updated 2 months ago
xgrammar
by
mlc-ai
0.6%
1k
Library for efficient structured generation
Starred by
+4
Created 1 year ago
Updated 10 hours ago
Awesome-ML-SYS-Tutorial
by
zhaochenyang20
1.0%
4k
ML SYS learning notes and code
Created 1 year ago
Updated 4 weeks ago
OpenRLHF
by
OpenRLHF
0.6%
8k
RLHF framework for scalable training of large language models
Starred by
+9
Created 2 years ago
Updated 3 days ago
SageAttention
by
thu-ml
1.1%
3k
Attention kernel for plug-and-play inference acceleration
Created 1 year ago
Updated 1 week ago
sgl-learning-materials
by
sgl-project
1.4%
630
Learning materials for SGLang, an efficient LLM serving engine
Created 1 year ago
Updated 1 week ago
fish-speech
by
fishaudio
0.4%
24k
Open-source TTS for multilingual speech synthesis
Created 2 years ago
Updated 22 hours ago
Liger-Kernel
by
linkedin
0.4%
6k
Triton kernels for efficient LLM training
Starred by
+8
Created 1 year ago
Updated 9 hours ago
ao
by
pytorch
0.8%
2k
PyTorch library for quantization and sparsity in training/inference
Starred by
+11
Created 2 years ago
Updated 14 hours ago
appl
by
appl-team
0%
263
A prompt programming language for Python
Created 1 year ago
Updated 8 months ago
ttt-lm-jax
by
test-time-training
0%
423
JAX implementation of test-time training RNN research paper
Created 1 year ago
Updated 2 days ago
RouteLLM
by
lm-sys
0.3%
4k
Framework for LLM routing and cost reduction (research paper)
Starred by
+7
Created 1 year ago
Updated 1 year ago
DistServe
by
LLMServe
0.8%
715
Disaggregated serving system for LLMs
Created 1 year ago
Updated 7 months ago
gpt-fast
by
meta-pytorch
0.1%
6k
PyTorch text generation for efficient transformer inference
Starred by
+20
Created 2 years ago
Updated 2 months ago
lmdeploy
by
InternLM
0.3%
7k
Toolkit for LLM compression, deployment, and serving
Starred by
+8
Created 2 years ago
Updated 3 days ago
LLaVA-NeXT
by
LLaVA-VL
0.3%
4k
Multimodal model for image, video, and 3D understanding
Created 1 year ago
Updated 1 month ago
arena-hard-auto
by
lmarena
0.2%
950
Automatic LLM benchmark for instruction-tuned models, correlating with human preference
Starred by
+6
Created 2 years ago
Updated 4 months ago
llama3
by
meta-llama
0.1%
29k
*Deprecated* minimal example for loading and running Llama 3 models
Starred by
+13
Created 1 year ago
Updated 9 months ago
llama.cpp
by
ggml-org
0.3%
89k
C/C++ library for local LLM inference
Starred by
+51
Created 2 years ago
Updated 12 hours ago
SWE-agent
by
SWE-agent
0.2%
18k
Agent for automated software engineering (NeurIPS 2024)
Starred by
+23
Created 1 year ago
Updated 20 hours ago
ollama
by
ollama
0.3%
155k
CLI tool for running LLMs locally
Starred by
+45
Created 2 years ago
Updated 9 hours ago
Consistency_LLM
by
hao-ai-lab
0%
405
Parallel decoder for efficient LLM inference
Created 1 year ago
Updated 11 months ago
grok-1
by
xai-org
0.1%
51k
JAX example code for loading and running Grok-1 open-weights model
Starred by
+22
Created 1 year ago
Updated 1 year ago
flashinfer
by
flashinfer-ai
1.0%
4k
Kernel library for LLM serving
Starred by
+10
Created 2 years ago
Updated 15 hours ago
distrifuser
by
mit-han-lab
0%
709
Research paper for distributed parallel inference of high-resolution diffusion models
Created 1 year ago
Updated 11 months ago
LWM
by
LargeWorldModel
0.0%
7k
Multimodal autoregressive model for long-context video/text
Starred by
+6
Created 1 year ago
Updated 1 year ago
Qwen3
by
QwenLM
0.3%
25k
Large language model series by Qwen team, Alibaba Cloud
Starred by
+11
Created 1 year ago
Updated 3 weeks ago
search_with_lepton
by
leptonai
0.0%
8k
Conversational search engine demo
Starred by
+9
Created 1 year ago
Updated 1 month ago
sglang
by
sgl-project
1.7%
20k
Fast serving framework for LLMs and vision language models
Starred by
+32
Created 1 year ago
Updated 9 hours ago
EAGLE
by
SafeAILab
0.5%
2k
Speculative decoding research paper for faster LLM inference
Starred by
+5
Created 1 year ago
Updated 3 weeks ago
llm-decontaminator
by
lm-sys
0%
311
LLM contamination detector for quantifying rephrased samples
Created 2 years ago
Updated 1 year ago
ChatRTX
by
NVIDIA
0.1%
3k
Demo app for local RAG chatbot on Windows
Created 2 years ago
Updated 7 months ago
S-LoRA
by
S-LoRA
0.1%
2k
System for scalable LoRA adapter serving
Starred by
+1
Created 2 years ago
Updated 1 year ago
Yi
by
01-ai
0.0%
8k
Open-source bilingual LLMs trained from scratch
Starred by
+7
Created 2 years ago
Updated 11 months ago
CTranslate2
by
OpenNMT
0.6%
4k
Fast inference engine for Transformer models
Starred by
+6
Created 6 years ago
Updated 1 day ago
DeepSpeed-MII
by
deepspeedai
0.1%
2k
Python library for high-throughput, low-latency, and cost-effective model inference
Starred by
+5
Created 3 years ago
Updated 4 months ago
ChatGLM3
by
zai-org
0.0%
14k
Bilingual chat LLM for complex scenarios (tool use, code execution, agents)
Created 2 years ago
Updated 9 months ago
AgentTuning
by
THUDM
0.1%
1k
Agent tuning for generalized LLM agent abilities
Created 2 years ago
Updated 2 years ago
TensorRT-LLM
by
NVIDIA
0.4%
12k
LLM inference optimization SDK for NVIDIA GPUs
Starred by
+17
Created 2 years ago
Updated 9 hours ago
guidance
by
guidance-ai
0.1%
21k
Guidance is a programming paradigm for steering LLMs
Starred by
+38
Created 3 years ago
Updated 3 weeks ago
Medusa
by
FasterDecoding
0.3%
3k
Framework for accelerating LLM generation using multiple decoding heads
Starred by
+6
Created 2 years ago
Updated 1 year ago
codellama
by
meta-llama
0.0%
16k
Inference code for CodeLlama models
Starred by
+12
Created 2 years ago
Updated 1 year ago
llm-attacks
by
llm-attacks
0.3%
4k
Attack framework for aligned LLMs, based on a research paper
Starred by
+3
Created 2 years ago
Updated 1 year ago
rerope
by
bojone
0.3%
381
Position embeddings research paper
Created 2 years ago
Updated 1 year ago
LightLLM
by
ModelTC
0.5%
4k
Python framework for LLM inference and serving
Starred by
+6
Created 2 years ago
Updated 10 hours ago
long_llama
by
CStanKonrad
0%
1k
LLM for long context handling, fine-tuned with Focused Transformer
Created 2 years ago
Updated 2 years ago
text-generation-inference
by
huggingface
0.1%
11k
Rust/Python/gRPC server for fast LLM text generation
Starred by
+35
Created 3 years ago
Updated 1 month ago
gorilla-cli
by
gorilla-llm
0%
1k
CLI tool using LLMs to generate commands
Created 2 years ago
Updated 1 year ago
LongChat
by
DachengLi1
0%
531
Long-context LLM chatbot training and evaluation framework
Starred by
+2
Created 2 years ago
Updated 1 year ago
vllm
by
vllm-project
1.1%
62k
LLM serving engine for high-throughput, memory-efficient inference
Starred by
+57
Created 2 years ago
Updated 9 hours ago
WizardLM
by
nlpxucan
0.0%
9k
LLMs built using Evol-Instruct for complex instruction following
Starred by
+15
Created 2 years ago
Updated 5 months ago
mlc-llm
by
mlc-ai
0.1%
22k
Universal LLM deployment engine with ML compilation
Starred by
+21
Created 2 years ago
Updated 1 week ago
LLaVA
by
haotian-liu
0.2%
24k
Multimodal assistant with GPT-4 level capabilities
Starred by
+16
Created 2 years ago
Updated 1 year ago
MiniGPT-4
by
Vision-CAIR
0.0%
26k
Vision-language model for multi-task learning
Starred by
+15
Created 2 years ago
Updated 1 year ago
web-llm
by
mlc-ai
0.2%
17k
In-browser LLM inference engine using WebGPU for hardware acceleration
Starred by
+20
Created 2 years ago
Updated 2 days ago
EasyLM
by
young-geng
0%
2k
LLM training/finetuning framework in JAX/Flax
Starred by
+9
Created 2 years ago
Updated 1 year ago
zkml
by
ddkang
0%
362
Framework for proofs of ML model execution in ZK-SNARKs
Created 2 years ago
Updated 1 year ago
llama
by
meta-llama
0.1%
59k
Inference code for Llama 2 models (deprecated)
Starred by
+38
Created 2 years ago
Updated 9 months ago
FastChat
by
lm-sys
0.1%
39k
Open platform for training, serving, and evaluating LLM-based chatbots
Starred by
+36
Created 2 years ago
Updated 5 months ago
web-stable-diffusion
by
mlc-ai
0.1%
4k
Browser-based Stable Diffusion demo with no server support
Starred by
+2
Created 2 years ago
Updated 1 year ago
FlexLLMGen
by
FMInference
0.0%
9k
High-throughput generation engine for LLMs with limited GPU memory
Starred by
+20
Created 2 years ago
Updated 1 year ago
smoothquant
by
mit-han-lab
0.4%
2k
Post-training quantization research paper for large language models
Starred by
+1
Created 3 years ago
Updated 1 year ago
dpm-solver
by
LuChengTHU
0.1%
2k
Fast ODE solver for diffusion probabilistic model sampling
Created 3 years ago
Updated 1 year ago
GLM-130B
by
zai-org
0.0%
8k
Bilingual model for research and evaluation
Starred by
+6
Created 3 years ago
Updated 2 years ago
AITemplate
by
facebookincubator
0.0%
5k
Generate high-performance inference engines
Starred by
+19
Created 3 years ago
Updated 1 week ago
compiler-and-arch
by
KnowingNothing
0.2%
504
Compiler/architecture resources for emerging domains
Created 3 years ago
Updated 9 months ago
alpa
by
alpa-projects
0.0%
3k
Auto-parallelization framework for large-scale neural network training and serving
Starred by
+17
Created 4 years ago
Updated 1 year ago
skypilot
by
skypilot-org
0.3%
9k
Framework for cloud AI/batch jobs, unifying execution across diverse infrastructure
Starred by
+24
Created 4 years ago
Updated 9 hours ago
flexflow-train
by
flexflow
0.1%
2k
Accelerating distributed deep learning training
Starred by
+7
Created 7 years ago
Updated 1 week ago
aqueduct
by
RunLLM
0%
520
MLOps framework for cloud deployment of LLM/ML workloads
Starred by
+3
Created 3 years ago
Updated 2 years ago
paper-reading
by
mli
0.3%
32k
Deep learning paper readings
Starred by
+2
Created 4 years ago
Updated 7 months ago
vision_transformer
by
google-research
0.3%
12k
Vision Transformer and MLP-Mixer models in JAX/Flax
Starred by
+5
Created 5 years ago
Updated 8 months ago
flax
by
google
0.2%
7k
NN library for JAX, designed for flexibility in neural network research
Starred by
+19
Created 5 years ago
Updated 16 hours ago
awesome-tensor-compilers
by
merrymercy
0.1%
3k
Curated list of tensor compiler projects and papers
Starred by
+10
Created 5 years ago
Updated 1 year ago
antares
by
microsoft
0%
469
Compiler solution for PyTorch operator optimization on diverse accelerators
Created 5 years ago
Updated 6 months ago
dgl
by
dmlc
0.1%
14k
Python package for deep learning on graphs
Starred by
+10
Created 7 years ago
Updated 3 months ago
tvm
by
apache
0.1%
13k
Compiler stack for deep learning systems
Starred by
+20
Created 9 years ago
Updated 12 hours ago
MARL-Papers
by
LantaoYu
0.2%
5k
Paper list for multi-agent reinforcement learning (MARL)
Created 8 years ago
Updated 5 months ago