Lianmin Zheng
Coauthor of SGLang, vLLM
Starred Projects (89)
GLM-4.5
by
zai-org
3.6%
3k
Foundation models for intelligent agents
Created 2 months ago
Updated 3 days ago
Kimi-K2
by
MoonshotAI
1.7%
8k
State-of-the-art MoE language model
Starred by
+3
Created 2 months ago
Updated 1 week ago
ome
by
sgl-project
2.6%
271
Kubernetes operator for LLM serving
Created 4 months ago
Updated 18 hours ago
slime
by
THUDM
4.5%
2k
LLM post-training framework for RL scaling
Starred by
+2
Created 3 months ago
Updated 14 hours ago
RL2
by
ChenmienTan
0.5%
865
Reinforcement learning for large language models
Starred by
+1
Created 5 months ago
Updated 6 days ago
Mooncake
by
kvcache-ai
1.3%
4k
Disaggregated architecture for LLM serving (research paper)
Starred by
+2
Created 1 year ago
Updated 16 hours ago
AReaL
by
inclusionAI
1.5%
3k
Distributed RL system for LLM reasoning
Created 6 months ago
Updated 14 hours ago
LLaMA-Factory
by
hiyouga
1.1%
58k
Unified fine-tuning tool for 100+ LLMs & VLMs (ACL 2024)
Starred by
+21
Created 2 years ago
Updated 2 days ago
verl
by
volcengine
1.9%
13k
RL training library for LLMs
Starred by
+13
Created 10 months ago
Updated 21 hours ago
DeepSeek-V3
by
deepseek-ai
0.1%
99k
MoE language model with 671B total parameters (research paper)
Starred by
+13
Created 8 months ago
Updated 3 weeks ago
gemlite
by
mobiusml
1.4%
369
Triton kernels for efficient low-bit matrix multiplication
Created 1 year ago
Updated 4 days ago
HunyuanVideo
by
Tencent-Hunyuan
0.2%
11k
PyTorch code for video generation research
Created 9 months ago
Updated 3 weeks ago
MiniCPM
by
OpenBMB
0.4%
8k
Ultra-efficient LLMs for end devices, achieving 5x+ speedup
Created 1 year ago
Updated 1 week ago
nunchaku
by
nunchaku-tech
1.9%
3k
High-performance 4-bit diffusion model inference engine
Created 10 months ago
Updated 2 days ago
deepcompressor
by
nunchaku-tech
0.9%
632
Model compression toolbox for LLMs and diffusion models
Created 1 year ago
Updated 1 month ago
xgrammar
by
mlc-ai
2.4%
1k
Library for efficient structured generation
Starred by
+4
Created 1 year ago
Updated 22 hours ago
Awesome-ML-SYS-Tutorial
by
zhaochenyang20
1.6%
4k
ML SYS learning notes and code
Created 10 months ago
Updated 1 day ago
OpenRLHF
by
OpenRLHF
0.7%
8k
RLHF framework for scalable training of large language models
Starred by
+8
Created 2 years ago
Updated 3 days ago
SageAttention
by
thu-ml
1.3%
2k
Attention kernel for plug-and-play inference acceleration
Created 11 months ago
Updated 1 month ago
sgl-learning-materials
by
sgl-project
1.4%
578
Learning materials for SGLang, an efficient LLM serving engine
Created 1 year ago
Updated 2 weeks ago
fish-speech
by
fishaudio
0.3%
23k
Open-source TTS for multilingual speech synthesis
Created 1 year ago
Updated 1 week ago
Liger-Kernel
by
linkedin
0.6%
6k
Triton kernels for efficient LLM training
Starred by
+8
Created 1 year ago
Updated 1 day ago
ao
by
pytorch
0.6%
2k
PyTorch library for quantization and sparsity in training/inference
Starred by
+9
Created 1 year ago
Updated 22 hours ago
appl
by
appl-team
0.4%
261
A prompt programming language for Python
Created 1 year ago
Updated 7 months ago
ttt-lm-jax
by
test-time-training
0.2%
422
JAX implementation of test-time training RNN research paper
Created 1 year ago
Updated 1 year ago
RouteLLM
by
lm-sys
0.3%
4k
Framework for LLM routing and cost reduction (research paper)
Starred by
+6
Created 1 year ago
Updated 1 year ago
DistServe
by
LLMServe
0.3%
688
Disaggregated serving system for LLMs
Created 1 year ago
Updated 5 months ago
gpt-fast
by
meta-pytorch
0.2%
6k
PyTorch text generation for efficient transformer inference
Starred by
+20
Created 1 year ago
Updated 3 weeks ago
lmdeploy
by
InternLM
0.7%
7k
Toolkit for LLM compression, deployment, and serving
Starred by
+8
Created 2 years ago
Updated 16 hours ago
LLaVA-NeXT
by
LLaVA-VL
0.7%
4k
Multimodal model for image, video, and 3D understanding
Created 1 year ago
Updated 4 days ago
arena-hard-auto
by
lmarena
0.4%
925
Automatic LLM benchmark for instruction-tuned models, correlating with human preference
Starred by
+5
Created 1 year ago
Updated 2 months ago
llama3
by
meta-llama
0.1%
29k
Minimal example for loading and running Llama 3 models (deprecated)
Starred by
+13
Created 1 year ago
Updated 7 months ago
llama.cpp
by
ggml-org
0.4%
87k
C/C++ library for local LLM inference
Starred by
+51
Created 2 years ago
Updated 14 hours ago
SWE-agent
by
SWE-agent
0.5%
17k
Agent for automated software engineering (NeurIPS 2024)
Starred by
+23
Created 1 year ago
Updated 3 days ago
ollama
by
ollama
0.3%
152k
CLI tool for running LLMs locally
Starred by
+42
Created 2 years ago
Updated 20 hours ago
Consistency_LLM
by
hao-ai-lab
0.3%
404
Parallel decoder for efficient LLM inference
Created 1 year ago
Updated 10 months ago
grok-1
by
xai-org
0.1%
51k
JAX example code for loading and running Grok-1 open-weights model
Starred by
+22
Created 1 year ago
Updated 1 year ago
flashinfer
by
flashinfer-ai
1.0%
4k
Kernel library for LLM serving
Starred by
+10
Created 2 years ago
Updated 23 hours ago
distrifuser
by
mit-han-lab
0.1%
705
Research paper for distributed parallel inference of high-resolution diffusion models
Created 1 year ago
Updated 9 months ago
LWM
by
LargeWorldModel
0.1%
7k
Multimodal autoregressive model for long-context video/text
Starred by
+6
Created 1 year ago
Updated 11 months ago
Qwen3
by
QwenLM
0.4%
25k
Large language model series by Qwen team, Alibaba Cloud
Starred by
+10
Created 1 year ago
Updated 2 weeks ago
search_with_lepton
by
leptonai
0.0%
8k
Conversational search engine demo
Starred by
+9
Created 1 year ago
Updated 2 weeks ago
sglang
by
sgl-project
1.2%
18k
Fast serving framework for LLMs and vision language models
Starred by
+32
Created 1 year ago
Updated 14 hours ago
EAGLE
by
SafeAILab
10.6%
2k
Speculative decoding research paper for faster LLM inference
Starred by
+5
Created 1 year ago
Updated 1 week ago
llm-decontaminator
by
lm-sys
0.3%
310
LLM contamination detector for quantifying rephrased samples
Created 1 year ago
Updated 1 year ago
ChatRTX
by
NVIDIA
0.3%
3k
Demo app for local RAG chatbot on Windows
Created 1 year ago
Updated 5 months ago
S-LoRA
by
S-LoRA
0.2%
2k
System for scalable LoRA adapter serving
Starred by
+1
Created 1 year ago
Updated 1 year ago
Yi
by
01-ai
0%
8k
Open-source bilingual LLMs trained from scratch
Starred by
+7
Created 1 year ago
Updated 9 months ago
CTranslate2
by
OpenNMT
0.3%
4k
Fast inference engine for Transformer models
Starred by
+6
Created 6 years ago
Updated 5 months ago
DeepSpeed-MII
by
deepspeedai
0.1%
2k
Python library for high-throughput, low-latency, and cost-effective model inference
Starred by
+5
Created 3 years ago
Updated 2 months ago
ChatGLM3
by
zai-org
0.0%
14k
Bilingual chat LLM for complex scenarios (tool use, code execution, agents)
Created 1 year ago
Updated 8 months ago
AgentTuning
by
THUDM
0%
1k
Agent tuning for generalized LLM agent abilities
Created 1 year ago
Updated 1 year ago
TensorRT-LLM
by
NVIDIA
0.5%
12k
LLM inference optimization SDK for NVIDIA GPUs
Starred by
+17
Created 2 years ago
Updated 14 hours ago
guidance
by
guidance-ai
0.1%
21k
Programming paradigm for steering LLMs
Starred by
+38
Created 2 years ago
Updated 1 day ago
Medusa
by
FasterDecoding
0.3%
3k
Framework for accelerating LLM generation using multiple decoding heads
Starred by
+6
Created 2 years ago
Updated 1 year ago
codellama
by
meta-llama
0.0%
16k
Inference code for CodeLlama models
Starred by
+12
Created 2 years ago
Updated 1 year ago
llm-attacks
by
llm-attacks
0.2%
4k
Attack framework for aligned LLMs, based on a research paper
Starred by
+3
Created 2 years ago
Updated 1 year ago
rerope
by
bojone
0%
381
Position embeddings research paper
Created 2 years ago
Updated 1 year ago
LightLLM
by
ModelTC
0.5%
4k
Python framework for LLM inference and serving
Starred by
+6
Created 2 years ago
Updated 14 hours ago
long_llama
by
CStanKonrad
0%
1k
LLM for long context handling, fine-tuned with Focused Transformer
Created 2 years ago
Updated 1 year ago
text-generation-inference
by
huggingface
0.2%
11k
Rust/Python/gRPC server for fast LLM text generation
Starred by
+34
Created 2 years ago
Updated 1 day ago
gorilla-cli
by
gorilla-llm
0.1%
1k
CLI tool using LLMs to generate commands
Created 2 years ago
Updated 1 year ago
LongChat
by
DachengLi1
0.2%
533
Long-context LLM chatbot training and evaluation framework
Starred by
+2
Created 2 years ago
Updated 1 year ago
vllm
by
vllm-project
1.1%
58k
LLM serving engine for high-throughput, memory-efficient inference
Starred by
+55
Created 2 years ago
Updated 14 hours ago
WizardLM
by
nlpxucan
0.0%
9k
LLMs built using Evol-Instruct for complex instruction following
Starred by
+15
Created 2 years ago
Updated 3 months ago
mlc-llm
by
mlc-ai
0.3%
21k
Universal LLM deployment engine with ML compilation
Starred by
+21
Created 2 years ago
Updated 1 day ago
LLaVA
by
haotian-liu
0.2%
24k
Multimodal assistant built towards GPT-4V-level capabilities
Starred by
+16
Created 2 years ago
Updated 1 year ago
MiniGPT-4
by
Vision-CAIR
0.0%
26k
Vision-language model for multi-task learning
Starred by
+14
Created 2 years ago
Updated 1 year ago
web-llm
by
mlc-ai
0.2%
16k
In-browser LLM inference engine using WebGPU for hardware acceleration
Starred by
+19
Created 2 years ago
Updated 5 days ago
EasyLM
by
young-geng
0.0%
2k
LLM training/finetuning framework in JAX/Flax
Starred by
+9
Created 2 years ago
Updated 1 year ago
zkml
by
ddkang
0%
360
Framework for proofs of ML model execution in ZK-SNARKs
Created 2 years ago
Updated 1 year ago
llama
by
meta-llama
0.1%
59k
Inference code for Llama 2 models (deprecated)
Starred by
+37
Created 2 years ago
Updated 7 months ago
FastChat
by
lm-sys
0.1%
39k
Open platform for training, serving, and evaluating LLM-based chatbots
Starred by
+35
Created 2 years ago
Updated 3 months ago
web-stable-diffusion
by
mlc-ai
0.0%
4k
Browser-based Stable Diffusion demo with no server support
Starred by
+2
Created 2 years ago
Updated 1 year ago
FlexLLMGen
by
FMInference
0.1%
9k
High-throughput generation engine for LLMs with limited GPU memory
Starred by
+20
Created 2 years ago
Updated 10 months ago
smoothquant
by
mit-han-lab
0.3%
2k
Post-training quantization research paper for large language models
Starred by
+1
Created 2 years ago
Updated 1 year ago
dpm-solver
by
LuChengTHU
0.2%
2k
Fast ODE solver for diffusion probabilistic model sampling
Created 3 years ago
Updated 1 year ago
GLM-130B
by
zai-org
0.0%
8k
Bilingual model for research and evaluation
Starred by
+6
Created 3 years ago
Updated 2 years ago
compiler-and-arch
by
KnowingNothing
0%
486
Compiler/architecture resources for emerging domains
Created 3 years ago
Updated 8 months ago
alpa
by
alpa-projects
0.0%
3k
Auto-parallelization framework for large-scale neural network training and serving
Starred by
+17
Created 4 years ago
Updated 1 year ago
skypilot
by
skypilot-org
0.5%
9k
Framework for cloud AI/batch jobs, unifying execution across diverse infrastructure
Starred by
+24
Created 4 years ago
Updated 16 hours ago
aqueduct
by
RunLLM
0%
520
MLOps framework for cloud deployment of LLM/ML workloads
Starred by
+3
Created 3 years ago
Updated 2 years ago
paper-reading
by
mli
0.2%
31k
Deep learning paper readings
Starred by
+2
Created 3 years ago
Updated 6 months ago
vision_transformer
by
google-research
0.2%
12k
Vision Transformer and MLP-Mixer models in JAX/Flax
Starred by
+5
Created 4 years ago
Updated 6 months ago
flax
by
google
0.2%
7k
NN library for JAX, designed for flexibility in neural network research
Starred by
+19
Created 5 years ago
Updated 19 hours ago
antares
by
microsoft
0%
467
Compiler solution for PyTorch operator optimization on diverse accelerators
Created 5 years ago
Updated 5 months ago
dgl
by
dmlc
0.1%
14k
Python package for deep learning on graphs
Starred by
+10
Created 7 years ago
Updated 1 month ago
tvm
by
apache
0.3%
13k
Compiler stack for deep learning systems
Starred by
+19
Created 9 years ago
Updated 2 days ago
MARL-Papers
by
LantaoYu
0.2%
5k
Paper list for multi-agent reinforcement learning (MARL)
Created 8 years ago
Updated 3 months ago