Woosuk Kwon
Coauthor of vLLM
GitHub
Starred Projects (56)
batch_invariant_ops
by
thinking-machines-lab
65.1%
636
Enhance LLM inference determinism
Starred by
+1
Created 1 week ago
Updated 1 week ago
openevolve
by
codelion
1.3%
4k
Coding agent for scientific/algorithmic discovery, based on AlphaEvolve paper
Created 4 months ago
Updated 23 hours ago
nano-vllm
by
GeeeekExplorer
2.6%
7k
Lightweight vLLM implementation from scratch
Starred by
+1
Created 3 months ago
Updated 2 weeks ago
llm-d
by
llm-d
1.5%
2k
Kubernetes-native framework for distributed LLM inference
Created 4 months ago
Updated 1 day ago
dynamo
by
ai-dynamo
1.0%
5k
Inference framework for distributed generative AI model serving
Starred by
+6
Created 6 months ago
Updated 15 hours ago
MiMo
by
XiaomiMiMo
0.1%
2k
LLM for reasoning, pre-trained and post-trained for math/code tasks
Created 4 months ago
Updated 3 months ago
rllm
by
rllm-org
1.6%
4k
Framework for post-training language agents via reinforcement learning
Starred by
+2
Created 7 months ago
Updated 14 hours ago
chatgpt_system_prompt
by
LouisShark
0.3%
10k
GPT system prompt collection for prompt engineering and security education
Starred by
+2
Created 1 year ago
Updated 2 days ago
fairseq2
by
facebookresearch
0.2%
1k
Sequence modeling toolkit for content generation research
Created 2 years ago
Updated 1 day ago
Mooncake
by
kvcache-ai
1.3%
4k
Research paper on a disaggregated architecture for LLM serving
Starred by
+2
Created 1 year ago
Updated 16 hours ago
xgrammar
by
mlc-ai
2.4%
1k
Library for efficient structured generation
Starred by
+4
Created 1 year ago
Updated 22 hours ago
Liger-Kernel
by
linkedin
0.6%
6k
Triton kernels for efficient LLM training
Starred by
+8
Created 1 year ago
Updated 1 day ago
Nanoflow
by
efeslab
0.5%
891
LLM serving framework for high throughput
Created 1 year ago
Updated 1 day ago
xla
by
pytorch
0.3%
3k
PyTorch on XLA devices
Starred by
+15
Created 6 years ago
Updated 1 day ago
ao
by
pytorch
0.6%
2k
PyTorch library for quantization and sparsity in training/inference
Starred by
+9
Created 1 year ago
Updated 22 hours ago
TensorRT-Model-Optimizer
by
NVIDIA
2.4%
1k
Library for optimizing deep learning models for GPU inference
Created 1 year ago
Updated 18 hours ago
llm-compressor
by
vllm-project
1.4%
2k
Transformers-compatible library for LLM compression, optimized for vLLM deployment
Created 1 year ago
Updated 1 day ago
intel-extension-for-pytorch
by
intel
0.3%
2k
PyTorch extension for performance boost on Intel platforms
Created 5 years ago
Updated 3 days ago
ThunderKittens
by
HazyResearch
0.6%
3k
CUDA kernel framework for fast deep learning primitives
Starred by
+12
Created 1 year ago
Updated 2 days ago
mirage
by
mirage-project
2.2%
2k
Tool for fast GPU kernel generation via superoptimization
Starred by
+1
Created 1 year ago
Updated 1 day ago
AutoAWQ
by
casper-hansen
0.2%
2k
Tool for 4-bit quantized LLM inference
Starred by
+4
Created 2 years ago
Updated 4 months ago
grok-1
by
xai-org
0.1%
51k
JAX example code for loading and running Grok-1 open-weights model
Starred by
+22
Created 1 year ago
Updated 1 year ago
lm-evaluation-harness
by
EleutherAI
0.7%
10k
Framework for few-shot language model evaluation
Starred by
+16
Created 5 years ago
Updated 2 days ago
LLMSys-PaperList
by
AmberLJC
0.6%
2k
Curated list of LLM systems papers
Created 2 years ago
Updated 2 weeks ago
aici
by
microsoft
0.1%
2k
Constrains LLM output using Wasm programs
Starred by
+7
Created 2 years ago
Updated 7 months ago
mlc-llm
by
mlc-ai
0.3%
21k
Universal LLM deployment engine with ML compilation
Starred by
+21
Created 2 years ago
Updated 1 day ago
mscclpp
by
microsoft
0.7%
417
GPU-driven communication stack for scalable AI applications
Created 2 years ago
Updated 18 hours ago
sglang
by
sgl-project
1.2%
18k
Fast serving framework for LLMs and vision language models
Starred by
+32
Created 1 year ago
Updated 14 hours ago
flashinfer
by
flashinfer-ai
1.0%
4k
Kernel library for LLM serving
Starred by
+10
Created 2 years ago
Updated 23 hours ago
punica
by
punica-ai
0.2%
1k
LoRA serving system (research paper) for multi-tenant LLM inference
Starred by
+2
Created 2 years ago
Updated 1 year ago
LLMCompiler
by
SqueezeAILab
0.1%
2k
LLM compiler for parallel function calling
Starred by
+2
Created 1 year ago
Updated 1 year ago
gpt-fast
by
meta-pytorch
0.2%
6k
PyTorch text generation for efficient transformer inference
Starred by
+20
Created 1 year ago
Updated 3 weeks ago
TensorRT-LLM
by
NVIDIA
0.5%
12k
LLM inference optimization SDK for NVIDIA GPUs
Starred by
+17
Created 2 years ago
Updated 14 hours ago
WizardLM
by
nlpxucan
0.0%
9k
LLMs built using Evol-Instruct for complex instruction following
Starred by
+15
Created 2 years ago
Updated 3 months ago
outlines
by
dottxt-ai
0.3%
13k
SDK for structured LLM text generation
Starred by
+34
Created 2 years ago
Updated 16 hours ago
Awesome-LLM
by
Hannibal046
0.3%
25k
Curated list of Large Language Model resources
Starred by
+8
Created 2 years ago
Updated 1 month ago
gorilla
by
ShishirPatil
0.2%
12k
LLM tool-use framework for API invocation and function calling
Starred by
+15
Created 2 years ago
Updated 22 hours ago
LLMSurvey
by
RUCAIBox
0.2%
12k
Survey paper for large language models
Starred by
+2
Created 2 years ago
Updated 6 months ago
CTranslate2
by
OpenNMT
0.3%
4k
Fast inference engine for Transformer models
Starred by
+6
Created 6 years ago
Updated 5 months ago
SqueezeLLM
by
SqueezeAILab
0.0%
703
Quantization framework for efficient LLM serving (ICML 2024 paper)
Created 2 years ago
Updated 1 year ago
vllm
by
vllm-project
1.1%
58k
LLM serving engine for high-throughput, memory-efficient inference
Starred by
+55
Created 2 years ago
Updated 14 hours ago
Awesome-LLMOps
by
tensorchord
0.4%
5k
Curated list of LLMOps tools for developers
Starred by
+3
Created 3 years ago
Updated 1 month ago
FastChat
by
lm-sys
0.1%
39k
Open platform for training, serving, and evaluating LLM-based chatbots
Starred by
+35
Created 2 years ago
Updated 3 months ago
llama
by
meta-llama
0.1%
59k
Inference code for Llama 2 models (deprecated)
Starred by
+37
Created 2 years ago
Updated 7 months ago
Megatron-LM
by
NVIDIA
0.5%
14k
Framework for training transformer models at scale
Starred by
+18
Created 6 years ago
Updated 15 hours ago
flash-attention
by
Dao-AILab
0.6%
20k
Fast, memory-efficient attention implementation
Starred by
+31
Created 3 years ago
Updated 1 day ago
TransformerEngine
by
NVIDIA
0.4%
3k
Library for Transformer model acceleration on NVIDIA GPUs
Starred by
+4
Created 3 years ago
Updated 20 hours ago
x-transformers
by
lucidrains
0.2%
6k
Transformer library with extensive experimental features
Starred by
+7
Created 4 years ago
Updated 5 days ago
compiler-and-arch
by
KnowingNothing
0.0%
486
Compiler/architecture resources for emerging domains
Created 3 years ago
Updated 8 months ago
skypilot
by
skypilot-org
0.5%
9k
Framework for cloud AI/batch jobs, unifying execution across diverse infrastructure
Starred by
+24
Created 4 years ago
Updated 16 hours ago
FasterTransformer
by
NVIDIA
0.1%
6k
Optimized transformer library for inference
Starred by
+12
Created 4 years ago
Updated 1 year ago
alpa
by
alpa-projects
0.0%
3k
Auto-parallelization framework for large-scale neural network training and serving
Starred by
+17
Created 4 years ago
Updated 1 year ago
transformers
by
huggingface
0.3%
150k
ML library for pretrained model inference and training
Starred by
+94
Created 6 years ago
Updated 14 hours ago
ray
by
ray-project
0.3%
39k
AI compute engine for scaling Python and AI applications
Starred by
+51
Created 9 years ago
Updated 14 hours ago
tvm
by
apache
0.3%
13k
Compiler stack for deep learning systems
Starred by
+19
Created 9 years ago
Updated 2 days ago
DeepLearningExamples
by
NVIDIA
0.1%
14k
Deep learning examples for training and deployment
Starred by
+8
Created 7 years ago
Updated 1 year ago