Zhuohan Li

@zhuohan123

Author of vLLM

Starred Projects (93)

Starred by George Hotz (Author of tinygrad; Founder of the tiny corp, comma.ai), Ying Sheng (Author of SGLang), and 1 more.

mirage by mirage-project

1.3%
2k
Tool for fast GPU kernel generation via superoptimization
created 1 year ago
updated 1 day ago
Starred by Andrej Karpathy (Founder of Eureka Labs; Formerly at Tesla, OpenAI; Author of CS 231n), Alexander Wettig (Author of SWE-bench, SWE-agent), and 5 more.

torchtitan by pytorch

0.9%
4k
PyTorch platform for generative AI model training research
created 1 year ago
updated 15 hours ago
Starred by Andrej Karpathy (Founder of Eureka Labs; Formerly at Tesla, OpenAI; Author of CS 231n), Georgios Konstantopoulos (CTO, General Partner at Paradigm), and 6 more.

ThunderKittens by HazyResearch

0.6%
3k
CUDA kernel framework for fast deep learning primitives
created 1 year ago
updated 3 days ago
Starred by Patrick von Platen (Core Contributor to Hugging Face Transformers and Diffusers), Simon Willison (Co-creator of Django), and 8 more.

simple-evals by openai

0.4%
4k
Lightweight library for evaluating language models
created 1 year ago
updated 3 weeks ago
Starred by Chip Huyen (Author of AI Engineering, Designing Machine Learning Systems) and Lianmin Zheng (Author of SGLang).

Consistency_LLM by hao-ai-lab

0%
397
Parallel decoder for efficient LLM inference
created 1 year ago
updated 8 months ago
Starred by George Hotz (Author of tinygrad; Founder of the tiny corp, comma.ai), Fabian Hedin (Cofounder of Lovable), and 11 more.

grok-1 by xai-org

0.1%
50k
JAX example code for loading and running Grok-1 open-weights model
created 1 year ago
updated 11 months ago
Starred by Jeff Hammerbacher (Cofounder of Cloudera) and Chip Huyen (Author of AI Engineering, Designing Machine Learning Systems).

kserve by kserve

0.5%
4k
Kubernetes CRD for scalable ML model serving
created 6 years ago
updated 2 days ago
Starred by Matei Zaharia (Cofounder of Databricks), Chip Huyen (Author of AI Engineering, Designing Machine Learning Systems), and 2 more.

LWM by LargeWorldModel

0.0%
7k
Multimodal autoregressive model for long-context video/text
created 1 year ago
updated 9 months ago
Starred by Chip Huyen (Author of AI Engineering, Designing Machine Learning Systems), Soumith Chintala (Author of PyTorch), and 4 more.

search_with_lepton by leptonai

0.1%
8k
Conversational search engine demo
created 1 year ago
updated 1 month ago
Starred by Andrej Karpathy (Founder of Eureka Labs; Formerly at Tesla, OpenAI; Author of CS 231n), Anton Troynikov (Cofounder of Chroma), and 19 more.

llama_index by run-llama

0.3%
43k
Data framework for building LLM-powered agents
created 2 years ago
updated 12 hours ago
Starred by Travis Fischer (Founder of Agentic), Patrick von Platen (Core Contributor to Hugging Face Transformers and Diffusers), and 8 more.

LLaVA by haotian-liu

0.2%
23k
Multimodal assistant with GPT-4 level capabilities
created 2 years ago
updated 11 months ago
Starred by Andrej Karpathy (Founder of Eureka Labs; Formerly at Tesla, OpenAI; Author of CS 231n), Lianmin Zheng (Author of SGLang), and 12 more.

gpt-fast by pytorch-labs

0.1%
6k
PyTorch text generation for efficient transformer inference
created 1 year ago
updated 3 months ago
Starred by Tobi Lutke (Cofounder of Shopify), Patrick von Platen (Core Contributor to Hugging Face Transformers and Diffusers), and 12 more.

axolotl by axolotl-ai-cloud

0.6%
10k
CLI tool for streamlined post-training of AI models
created 2 years ago
updated 18 hours ago
Starred by Chip Huyen (Author of AI Engineering, Designing Machine Learning Systems), Omar Sanseviero (DevRel at Google DeepMind), and 4 more.

TensorRT-LLM by NVIDIA

0.6%
11k
LLM inference optimization SDK for NVIDIA GPUs
created 1 year ago
updated 11 hours ago
Starred by Georgios Konstantopoulos (CTO, General Partner at Paradigm) and Chip Huyen (Author of AI Engineering, Designing Machine Learning Systems).

streaming-llm by mit-han-lab

0.1%
7k
Framework for efficient LLM streaming
created 1 year ago
updated 1 year ago
Starred by Omar Sanseviero (DevRel at Google DeepMind), Chip Huyen (Author of AI Engineering, Designing Machine Learning Systems), and 2 more.

Medusa by FasterDecoding

0.2%
3k
Framework for accelerating LLM generation using multiple decoding heads
created 1 year ago
updated 1 year ago
Starred by Chip Huyen (Author of AI Engineering, Designing Machine Learning Systems), Jeremy Howard (Cofounder of fast.ai), and 3 more.

llm-awq by mit-han-lab

0.4%
3k
Weight quantization research paper for LLM compression/acceleration
created 2 years ago
updated 2 weeks ago
Starred by Andrej Karpathy (Founder of Eureka Labs; Formerly at Tesla, OpenAI; Author of CS 231n), Jiayi Pan (Author of SWE-Gym; AI Researcher at UC Berkeley), and 15 more.

flash-attention by Dao-AILab

0.7%
19k
Fast, memory-efficient attention implementation
created 3 years ago
updated 11 hours ago
Starred by Andrej Karpathy (Founder of Eureka Labs; Formerly at Tesla, OpenAI; Author of CS 231n), Tobi Lutke (Cofounder of Shopify), and 26 more.

vllm by vllm-project

1.0%
54k
LLM serving engine for high-throughput, memory-efficient inference
created 2 years ago
updated 7 hours ago
Starred by Andrej Karpathy (Founder of Eureka Labs; Formerly at Tesla, OpenAI; Author of CS 231n), Nat Friedman (Former CEO of GitHub), and 31 more.

llama.cpp by ggml-org

0.4%
84k
C/C++ library for local LLM inference
created 2 years ago
updated 7 hours ago
Starred by Tobi Lutke (Cofounder of Shopify), Chip Huyen (Author of AI Engineering, Designing Machine Learning Systems), and 6 more.

ctransformers by marella

0.1%
2k
Python bindings for fast Transformer model inference
created 2 years ago
updated 1 year ago
Starred by Jiayi Pan (Author of SWE-Gym; AI Researcher at UC Berkeley), Chip Huyen (Author of AI Engineering, Designing Machine Learning Systems), and 5 more.

EasyLM by young-geng

0.2%
2k
LLM training/finetuning framework in JAX/Flax
created 2 years ago
updated 11 months ago
Starred by Chip Huyen (Author of AI Engineering, Designing Machine Learning Systems), Elie Bursztein (Cybersecurity Lead at Google DeepMind), and 9 more.

open_llama by openlm-research

0.0%
8k
Open-source reproduction of LLaMA models
created 2 years ago
updated 2 years ago
Starred by Andrej Karpathy (Founder of Eureka Labs; Formerly at Tesla, OpenAI; Author of CS 231n), Carol Willing (Core Contributor to CPython, Jupyter), and 37 more.

langchain by langchain-ai

0.4%
113k
Framework for building LLM-powered applications
created 2 years ago
updated 18 hours ago
Starred by Nat Friedman (Former CEO of GitHub), Chip Huyen (Author of AI Engineering, Designing Machine Learning Systems), and 5 more.

FasterTransformer by NVIDIA

0.2%
6k
Optimized transformer library for inference
created 4 years ago
updated 1 year ago
Starred by Yang Song (Professor at Caltech; Research Scientist at OpenAI), Jeremy Howard (Cofounder of fast.ai), and 3 more.

PiPPy by pytorch

0.1%
775
PyTorch tool for pipeline parallelism
created 3 years ago
updated 11 months ago
Starred by Chip Huyen (Author of AI Engineering, Designing Machine Learning Systems), Jiayi Pan (Author of SWE-Gym; AI Researcher at UC Berkeley), and 10 more.

alpa by alpa-projects

0.1%
3k
Auto-parallelization framework for large-scale neural network training and serving
created 4 years ago
updated 1 year ago
Starred by Aravind Srinivas (Cofounder of Perplexity), Stas Bekman (Author of Machine Learning Engineering Open Book; Research Engineer at Snowflake), and 11 more.

DeepSpeed by deepspeedai

0.2%
40k
Deep learning optimization library for distributed training and inference
created 5 years ago
updated 18 hours ago
Starred by Chip Huyen (Author of AI Engineering, Designing Machine Learning Systems), Yang Song (Professor at Caltech; Research Scientist at OpenAI), and 15 more.

pytorch-lightning by Lightning-AI

0.1%
30k
Deep learning framework for pretraining, finetuning, and deploying AI models
created 6 years ago
updated 2 days ago
Starred by Hiroshi Shibata (Core Contributor to Ruby), Jiayi Pan (Author of SWE-Gym; AI Researcher at UC Berkeley), and 26 more.

ray by ray-project

0.3%
38k
AI compute engine for scaling Python and AI applications
created 8 years ago
updated 9 hours ago
Starred by Peter Norvig (Author of Artificial Intelligence: A Modern Approach; Research Director at Google), Aravind Srinivas (Cofounder of Perplexity), and 44 more.

tensorflow by tensorflow

0.1%
191k
Open-source ML framework
created 9 years ago
updated 6 hours ago