Zhuohan Li
Coauthor of vLLM
Starred Projects (98)
checkpoint-engine
by
MoonshotAI
42.9%
701
Middleware for efficient LLM weight updates during inference
Starred by
+3
Created 1 week ago
Updated 1 day ago
batch_invariant_ops
by
thinking-machines-lab
65.1%
636
Enhance LLM inference determinism
Starred by
+1
Created 1 week ago
Updated 1 week ago
wechat-bot
by
wangrongding
0.3%
9k
WeChat bot integrating multiple AI services
Created 3 years ago
Updated 3 weeks ago
harmony
by
openai
0.5%
4k
Renderer for OpenAI's harmony response format
Starred by
+9
Created 1 month ago
Updated 1 month ago
gpt-oss
by
openai
0.7%
18k
Open-weight LLMs for reasoning and agents
Starred by
+14
Created 2 months ago
Updated 3 days ago
mirage
by
mirage-project
2.2%
2k
Tool for fast GPU kernel generation via superoptimization
Starred by
+1
Created 1 year ago
Updated 1 day ago
Hunyuan3D-2.1
by
Tencent-Hunyuan
2.6%
2k
Image to 3D asset generation with PBR materials
Starred by
Created 3 months ago
Updated 1 week ago
torchtitan
by
pytorch
0.7%
4k
PyTorch platform for generative AI model training research
Starred by
+10
Created 1 year ago
Updated 21 hours ago
tilelang
by
tile-ai
1.5%
2k
DSL for high-performance GPU/CPU kernel development (GEMM, attention, etc.)
Starred by
Created 11 months ago
Updated 15 hours ago
3FS
by
deepseek-ai
0.2%
9k
Distributed file system for AI training/inference workloads
Starred by
+5
Created 6 months ago
Updated 1 week ago
DeepGEMM
by
deepseek-ai
0.3%
6k
CUDA library for efficient FP8 GEMM kernels with fine-grained scaling
Starred by
+3
Created 7 months ago
Updated 6 days ago
open-infra-index
by
deepseek-ai
0.1%
8k
AI infrastructure tools for efficient AGI development
Starred by
+15
Created 6 months ago
Updated 4 months ago
mochi
by
genmoai
0.7%
3k
Video generation model
Starred by
+5
Created 1 year ago
Updated 1 week ago
llm-compressor
by
vllm-project
1.4%
2k
Transformers-compatible library for LLM compression, optimized for vLLM deployment
Starred by
Created 1 year ago
Updated 1 day ago
Nanoflow
by
efeslab
0.5%
891
LLM serving framework for high throughput
Starred by
Created 1 year ago
Updated 1 day ago
vattention
by
microsoft
0.5%
416
Memory manager for LLM serving systems
Created 1 year ago
Updated 3 months ago
lm-evaluation-harness
by
EleutherAI
0.7%
10k
Framework for few-shot language model evaluation
Starred by
+16
Created 5 years ago
Updated 2 days ago
mistral.rs
by
EricLBuehler
0.3%
6k
LLM inference engine for blazing fast performance
Starred by
+8
Created 1 year ago
Updated 1 day ago
OpenHands
by
All-Hands-AI
0.3%
64k
AI platform for software development agents
Starred by
+36
Created 1 year ago
Updated 16 hours ago
ThunderKittens
by
HazyResearch
0.6%
3k
CUDA kernel framework for fast deep learning primitives
Starred by
+12
Created 1 year ago
Updated 2 days ago
arena-hard-auto
by
lmarena
0.4%
925
Automatic LLM benchmark for instruction-tuned models, correlating with human preference
Starred by
+5
Created 1 year ago
Updated 2 months ago
simple-evals
by
openai
0.3%
4k
Lightweight library for evaluating language models
Starred by
+13
Created 1 year ago
Updated 1 month ago
calm
by
zeux
0.3%
613
Single-GPU inference engine for rapid LLM prototyping
Starred by
Created 1 year ago
Updated 3 months ago
dspy
by
stanfordnlp
0.8%
28k
Framework for programming language models, not prompting
Starred by
+49
Created 2 years ago
Updated 15 hours ago
Consistency_LLM
by
hao-ai-lab
0.3%
404
Parallel decoder for efficient LLM inference
Starred by
Created 1 year ago
Updated 10 months ago
grok-1
by
xai-org
0.1%
51k
JAX example code for loading and running Grok-1 open-weights model
Starred by
+22
Created 1 year ago
Updated 1 year ago
mlc-llm
by
mlc-ai
0.3%
21k
Universal LLM deployment engine with ML compilation
Starred by
+21
Created 2 years ago
Updated 1 day ago
kserve
by
kserve
0.6%
5k
Kubernetes CRD for scalable ML model serving
Starred by
Created 6 years ago
Updated 1 day ago
LMFlow
by
OptimalScale
0.0%
8k
Toolkit for finetuning and inference of large foundation models
Starred by
+9
Created 2 years ago
Updated 1 month ago
TransformerEngine
by
NVIDIA
0.4%
3k
Library for Transformer model acceleration on NVIDIA GPUs
Starred by
+4
Created 3 years ago
Updated 20 hours ago
LWM
by
LargeWorldModel
0.1%
7k
Multimodal autoregressive model for long-context video/text
Starred by
+6
Created 1 year ago
Updated 11 months ago
search_with_lepton
by
leptonai
0.0%
8k
Conversational search engine demo
Starred by
+9
Created 1 year ago
Updated 2 weeks ago
llama_index
by
run-llama
0.3%
44k
Data framework for building LLM-powered agents
Starred by
+41
Created 2 years ago
Updated 19 hours ago
marlin
by
IST-DASLab
0.3%
898
FP16xINT4 kernel for fast LLM inference
Starred by
Created 1 year ago
Updated 1 year ago
sglang
by
sgl-project
1.2%
18k
Fast serving framework for LLMs and vision language models
Starred by
+32
Created 1 year ago
Updated 14 hours ago
LLaVA
by
haotian-liu
0.2%
24k
Multimodal assistant with GPT-4 level capabilities
Starred by
+16
Created 2 years ago
Updated 1 year ago
megablocks
by
databricks
0.6%
1k
Lightweight library for mixture-of-experts (MoE) training
Starred by
+15
Created 2 years ago
Updated 2 months ago
flashinfer
by
flashinfer-ai
1.0%
4k
Kernel library for LLM serving
Starred by
+10
Created 2 years ago
Updated 23 hours ago
gpt-fast
by
meta-pytorch
0.2%
6k
PyTorch text generation for efficient transformer inference
Starred by
+20
Created 1 year ago
Updated 3 weeks ago
LookaheadDecoding
by
hao-ai-lab
0.2%
1k
Parallel decoding algorithm for faster LLM inference
Starred by
Created 1 year ago
Updated 6 months ago
axolotl
by
axolotl-ai-cloud
0.5%
10k
CLI tool for streamlined post-training of AI models
Starred by
+23
Created 2 years ago
Updated 15 hours ago
TensorRT-LLM
by
NVIDIA
0.5%
12k
LLM inference optimization SDK for NVIDIA GPUs
Starred by
+17
Created 2 years ago
Updated 14 hours ago
letta
by
letta-ai
0.4%
18k
Agent framework for stateful agents with memory, reasoning, and context management
Starred by
+16
Created 1 year ago
Updated 1 day ago
streaming-llm
by
mit-han-lab
0.1%
7k
Framework for efficient LLM streaming
Starred by
+2
Created 2 years ago
Updated 1 year ago
llm-engine
by
scaleapi
0%
811
Open-source engine for fine-tuning and serving LLMs
Starred by
+3
Created 2 years ago
Updated 1 day ago
scalene
by
plasma-umass
0.2%
13k
Python profiler with AI-powered optimization proposals
Starred by
+14
Created 5 years ago
Updated 1 week ago
Medusa
by
FasterDecoding
0.3%
3k
Framework for accelerating LLM generation using multiple decoding heads
Starred by
+6
Created 2 years ago
Updated 1 year ago
outlines
by
dottxt-ai
0.3%
13k
SDK for structured LLM text generation
Starred by
+34
Created 2 years ago
Updated 16 hours ago
llm-awq
by
mit-han-lab
0.3%
3k
Weight quantization research paper for LLM compression/acceleration
Starred by
+4
Created 2 years ago
Updated 2 months ago
llama-cookbook
by
meta-llama
0.3%
18k
Guide for building with Llama models
Starred by
+14
Created 2 years ago
Updated 1 day ago
openchat
by
imoneoi
0.0%
5k
Open-source LLM fine-tuned with C-RLFT, inspired by offline reinforcement learning
Starred by
+4
Created 2 years ago
Updated 1 year ago
flash-attention
by
Dao-AILab
0.6%
20k
Fast, memory-efficient attention implementation
Starred by
+31
Created 3 years ago
Updated 1 day ago
Dromedary
by
IBM
0%
1k
Self-aligned language model research paper with minimal human supervision
Starred by
Created 2 years ago
Updated 23 hours ago
LLMSurvey
by
RUCAIBox
0.2%
12k
Survey paper for large language models
Starred by
+2
Created 2 years ago
Updated 6 months ago
vllm
by
vllm-project
1.1%
58k
LLM serving engine for high-throughput, memory-efficient inference
Starred by
+55
Created 2 years ago
Updated 14 hours ago
tabby
by
TabbyML
0.1%
32k
Self-hosted AI coding assistant for on-prem code completion
Starred by
+17
Created 2 years ago
Updated 3 weeks ago
LongChat
by
DachengLi1
0.2%
533
Long-context LLM chatbot training and evaluation framework
Starred by
+2
Created 2 years ago
Updated 1 year ago
gorilla
by
ShishirPatil
0.2%
12k
LLM tool-use framework for API invocation and function calling
Starred by
+15
Created 2 years ago
Updated 22 hours ago
gorilla-cli
by
gorilla-llm
0.1%
1k
CLI tool using LLMs to generate commands
Starred by
Created 2 years ago
Updated 1 year ago
llama.cpp
by
ggml-org
0.4%
87k
C/C++ library for local LLM inference
Starred by
+51
Created 2 years ago
Updated 14 hours ago
ray-llm
by
ray-project
0%
1k
LLM deployment framework on Ray (now upstreamed to Ray)
Starred by
+2
Created 2 years ago
Updated 6 months ago
peft
by
huggingface
0.3%
20k
Parameter-efficient fine-tuning (PEFT) library
Starred by
+16
Created 2 years ago
Updated 2 days ago
bitsandbytes
by
bitsandbytes-foundation
0.3%
8k
PyTorch library for k-bit quantization, enabling accessible LLMs
Starred by
+26
Created 4 years ago
Updated 2 days ago
ctransformers
by
marella
0.1%
2k
Python bindings for fast Transformer model inference
Starred by
+8
Created 2 years ago
Updated 1 year ago
CTranslate2
by
OpenNMT
0.3%
4k
Fast inference engine for Transformer models
Starred by
+6
Created 6 years ago
Updated 5 months ago
EasyLM
by
young-geng
0.0%
2k
LLM training/finetuning framework in JAX/Flax
Starred by
+9
Created 2 years ago
Updated 1 year ago
open_llama
by
openlm-research
0.1%
8k
Open-source reproduction of LLaMA models
Starred by
+14
Created 2 years ago
Updated 2 years ago
text-generation-inference
by
huggingface
0.2%
11k
Rust/Python/gRPC server for fast LLM text generation
Starred by
+34
Created 2 years ago
Updated 1 day ago
langchain
by
langchain-ai
0.4%
116k
Framework for building LLM-powered applications
Starred by
+80
Created 2 years ago
Updated 1 day ago
web-llm
by
mlc-ai
0.2%
16k
In-browser LLM inference engine using WebGPU for hardware acceleration
Starred by
+19
Created 2 years ago
Updated 5 days ago
FasterTransformer
by
NVIDIA
0.1%
6k
Optimized transformer library for inference
Starred by
+12
Created 4 years ago
Updated 1 year ago
FastChat
by
lm-sys
0.1%
39k
Open platform for training, serving, and evaluating LLM-based chatbots
Starred by
+35
Created 2 years ago
Updated 3 months ago
llama
by
meta-llama
0.1%
59k
Inference code for Llama 2 models (deprecated)
Starred by
+37
Created 2 years ago
Updated 7 months ago
FlexLLMGen
by
FMInference
0.1%
9k
High-throughput generation engine for LLMs with limited GPU memory
Starred by
+20
Created 2 years ago
Updated 10 months ago
PiPPy
by
pytorch
0%
779
PyTorch tool for pipeline parallelism
Starred by
+3
Created 3 years ago
Updated 1 year ago
compiler-and-arch
by
KnowingNothing
0%
486
Compiler/architecture resources for emerging domains
Starred by
Created 3 years ago
Updated 8 months ago
skypilot
by
skypilot-org
0.5%
9k
Framework for cloud AI/batch jobs, unifying execution across diverse infrastructure
Starred by
+24
Created 4 years ago
Updated 16 hours ago
paxml
by
google
0.8%
536
JAX-based ML framework for large-scale model training and experimentation
Starred by
Created 3 years ago
Updated 2 weeks ago
alpa
by
alpa-projects
0.0%
3k
Auto-parallelization framework for large-scale neural network training and serving
Starred by
+17
Created 4 years ago
Updated 1 year ago
DeepSpeed
by
deepspeedai
0.2%
40k
Deep learning optimization library for distributed training and inference
Starred by
+35
Created 5 years ago
Updated 1 day ago
DPR
by
facebookresearch
0.1%
2k
Dense Passage Retriever for open-domain Q&A research
Starred by
+4
Created 5 years ago
Updated 2 years ago
pytorch-lightning
by
Lightning-AI
0.1%
30k
Deep learning framework for pretraining, finetuning, and deploying AI models
Starred by
+29
Created 6 years ago
Updated 1 day ago
faiss
by
facebookresearch
0.3%
37k
Similarity search library for dense vectors
Starred by
+52
Created 8 years ago
Updated 1 week ago
tvm
by
apache
0.3%
13k
Compiler stack for deep learning systems
Starred by
+19
Created 9 years ago
Updated 2 days ago
universal-triggers
by
Eric-Wallace
0.3%
297
NLP attack/analysis research paper (EMNLP 2019)
Starred by
Created 6 years ago
Updated 1 year ago
gdrcopy
by
NVIDIA
1.2%
1k
GPU memory copy library using GPUDirect RDMA
Starred by
Created 10 years ago
Updated 1 month ago
Megatron-LM
by
NVIDIA
0.5%
14k
Framework for training transformer models at scale
Starred by
+18
Created 6 years ago
Updated 15 hours ago
DeepLearningExamples
by
NVIDIA
0.1%
14k
Deep learning examples for training and deployment
Starred by
+8
Created 7 years ago
Updated 1 year ago
gpt-2
by
openai
0.1%
24k
Code for research paper "Language Models are Unsupervised Multitask Learners"
Starred by
+27
Created 6 years ago
Updated 1 year ago
fairseq
by
facebookresearch
0.1%
32k
Sequence modeling toolkit for translation, language modeling, and text generation research
Starred by
+41
Created 8 years ago
Updated 1 week ago
rl_a3c_pytorch
by
dgriff777
0%
571
PyTorch implementation of A3C for Atari games
Starred by
+2
Created 8 years ago
Updated 2 years ago
ray
by
ray-project
0.3%
39k
AI compute engine for scaling Python and AI applications
Starred by
+51
Created 9 years ago
Updated 14 hours ago
bert
by
google-research
0.1%
40k
TensorFlow code and pre-trained models for BERT
Starred by
+25
Created 7 years ago
Updated 1 year ago
awesome-ai-residency
by
dangkhoasdc
0.1%
3k
Curated list of AI residency programs
Starred by
Created 7 years ago
Updated 5 months ago
3D-Machine-Learning
by
timzhang642
0.1%
10k
Resource list for 3D machine learning
Starred by
+3
Created 8 years ago
Updated 1 year ago
tensor2tensor
by
tensorflow
0.1%
16k
Deprecated library of deep learning models and datasets, succeeded by Trax
Starred by
+23
Created 8 years ago
Updated 2 years ago
generating-reviews-discovering-sentiment
by
openai
0%
2k
Language model code for generating reviews and discovering sentiment
Starred by
+8
Created 8 years ago
Updated 2 years ago
tensorflow
by
tensorflow
0.1%
192k
Open-source ML framework
Starred by
+95
Created 10 years ago
Updated 14 hours ago