Shizhe Diao

@shizhediao

Research Scientist at NVIDIA; Author of LMFlow

Starred Projects (312)

Starred by Tri Dao (Chief Scientist at Together AI) and Albert Gu (Cofounder of Cartesia; Professor at CMU).

hnet by goombalab

2.7%
654
Hierarchical sequence modeling with dynamic chunking
created 1 month ago
updated 2 weeks ago
Starred by Jeffrey Morgan (Cofounder of Ollama), Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems"), and 16 more.

codex by openai

6.9%
35k
Coding agent CLI tool for terminal-based chat-driven development
created 4 months ago
updated 17 hours ago
Starred by Wing Lian (Founder of Axolotl AI), Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems"), and 2 more.

SWE-Gym by SWE-Gym

0.8%
522
Environment for training software engineering agents
created 9 months ago
updated 2 weeks ago
Starred by Andrej Karpathy (Founder of Eureka Labs; Formerly at Tesla, OpenAI; Author of CS 231n), Clement Delangue (Cofounder of Hugging Face), and 41 more.

vllm by vllm-project

1.4%
55k
LLM serving engine for high-throughput, memory-efficient inference
created 2 years ago
updated 17 hours ago
Starred by Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems"), Jeff Hammerbacher (Cofounder of Cloudera), and 15 more.

open-r1 by huggingface

0.3%
25k
SDK for reproducing DeepSeek-R1
created 6 months ago
updated 4 days ago
Starred by Jason Knight (Director AI Compilers at NVIDIA; Cofounder of OctoML), Vincent Weisser (Cofounder of Prime Intellect), and 8 more.

verl by volcengine

2.2%
12k
RL training library for LLMs
created 9 months ago
updated 1 day ago
Starred by George Hotz (Author of tinygrad; Founder of the tiny corp, comma.ai), Andrej Karpathy (Founder of Eureka Labs; Formerly at Tesla, OpenAI; Author of CS 231n), and 8 more.

TinyZero by Jiayi-Pan

0.2%
12k
Minimal reproduction of DeepSeek R1 Zero for countdown/multiplication tasks
created 6 months ago
updated 3 months ago
Starred by Michael Han (Cofounder of Unsloth), Sebastian Raschka (Author of "Build a Large Language Model (From Scratch)"), and 10 more.

DeepSeek-R1 by deepseek-ai

0.1%
91k
Reasoning models research paper
created 6 months ago
updated 1 month ago
Starred by Yaowei Zheng (Author of LLaMA-Factory) and Alex Yu (Research Scientist at OpenAI; Former Cofounder of Luma AI).

VILA by NVlabs

0.5%
3k
Open-source VLMs for efficient video/multi-image understanding
created 1 year ago
updated 1 week ago
Starred by Jason Knight (Director AI Compilers at NVIDIA; Cofounder of OctoML), Tim J. Baek (Founder of Open WebUI), and 4 more.

awesome-o1 by srush

0%
1k
Bibliography for OpenAI's o1 project
created 10 months ago
updated 9 months ago
Starred by Jason Knight (Director AI Compilers at NVIDIA; Cofounder of OctoML), Simon Mo (Core Maintainer of vLLM), and 4 more.

lingua by facebookresearch

0.1%
5k
LLM research codebase for training and inference
created 10 months ago
updated 4 weeks ago
Starred by Jeff Hammerbacher (Cofounder of Cloudera), Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems"), and 6 more.

BERTopic by MaartenGr

0.3%
7k
Topic modeling with transformers and c-TF-IDF
created 4 years ago
updated 1 week ago
Starred by Jared Palmer (Ex-VP AI at Vercel; Founder of Turborepo; Author of Formik, TSDX), Jiaming Song (Chief Scientist at Luma AI), and 1 more.

human-eval by openai

0.5%
3k
Evaluation harness for LLMs trained on code
created 4 years ago
updated 7 months ago
Starred by Peter Norvig (Author of "Artificial Intelligence: A Modern Approach"; Research Director at Google), Didier Lopes (Founder of OpenBB), and 23 more.

llm.c by karpathy

0.2%
27k
LLM training in pure C/CUDA, no PyTorch needed
created 1 year ago
updated 1 month ago
Starred by Dan Guido (Cofounder of Trail of Bits), Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems"), and 4 more.

PurpleLlama by meta-llama

0.6%
4k
LLM security toolkit for assessing/improving generative AI models
created 1 year ago
updated 2 days ago
Starred by Jeff Hammerbacher (Cofounder of Cloudera), Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems"), and 5 more.

MiniCPM-o by OpenBMB

0.4%
20k
MLLM for vision, speech, and multimodal live streaming on your phone
created 1 year ago
updated 4 days ago
Starred by George Hotz (Author of tinygrad; Founder of the tiny corp, comma.ai), Zhiyuan Li (Cofounder of Nexa AI), and 18 more.

mamba by state-spaces

0.3%
16k
Mamba SSM architecture for sequence modeling
created 1 year ago
updated 4 weeks ago
Starred by Tobi Lutke (Cofounder of Shopify), Andrej Karpathy (Founder of Eureka Labs; Formerly at Tesla, OpenAI; Author of CS 231n), and 25 more.

unsloth by unslothai

1.2%
44k
Finetuning tool for LLMs, targeting speed and memory efficiency
created 1 year ago
updated 1 day ago
Starred by Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems"), Elie Bursztein (Cybersecurity Lead at Google DeepMind), and 6 more.

RouteLLM by lm-sys

0.7%
4k
Framework for LLM routing and cost reduction (research paper)
created 1 year ago
updated 1 year ago
Starred by Georgi Gerganov (Author of llama.cpp, whisper.cpp), Alex Yu (Research Scientist at OpenAI; Former Cofounder of Luma AI), and 10 more.

Qwen3 by QwenLM

0.9%
24k
Large language model series by Qwen team, Alibaba Cloud
created 1 year ago
updated 1 week ago
Starred by Lewis Tunstall (Research Engineer at Hugging Face), Patrick von Platen (Research Engineer at Mistral; Author of Hugging Face Diffusers), and 10 more.

torchtune by pytorch

0.5%
5k
PyTorch library for LLM post-training and experimentation
created 1 year ago
updated 1 day ago
Starred by Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems"), youkaichao (Core Maintainer of vLLM), and 3 more.

tianshou by thu-ml

0.3%
9k
PyTorch RL library for algorithm development and application
created 7 years ago
updated 1 week ago
Starred by Jiayi Pan (Author of SWE-Gym; MTS at xAI), Jianwei Yang (Research Scientist at Meta Superintelligence Lab), and 2 more.

unified-io-2 by allenai

0%
621
Unified-IO 2 code for training, inference, and demo
created 1 year ago
updated 1 year ago
Starred by Jared Palmer (Ex-VP AI at Vercel; Founder of Turborepo; Author of Formik, TSDX), Vincent Weisser (Cofounder of Prime Intellect), and 5 more.

weak-to-strong by openai

0%
3k
Weak-to-strong generalization research paper implementation
created 1 year ago
updated 1 year ago
Starred by Andrej Karpathy (Founder of Eureka Labs; Formerly at Tesla, OpenAI; Author of CS 231n).

inbox_cleaner by isafulf

0%
448
Python script for Gmail inbox management
created 1 year ago
updated 1 year ago
Starred by Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems"), Simon Willison (Author of Django), and 7 more.

Yi by 01-ai

0.1%
8k
Open-source bilingual LLMs trained from scratch
created 1 year ago
updated 8 months ago
Starred by Wes McKinney (Author of Pandas), Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems"), and 12 more.

autogen by microsoft

0.6%
49k
Agentic framework for multi-agent AI applications
created 2 years ago
updated 4 days ago
Starred by Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems"), Lewis Tunstall (Research Engineer at Hugging Face), and 2 more.

agents by aiwaves-cn

0.1%
6k
Open-source framework for self-evolving, data-centric autonomous language agents
created 2 years ago
updated 10 months ago
Starred by Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems"), Georgios Konstantopoulos (CTO, General Partner at Paradigm), and 2 more.

LongLoRA by dvlab-research

0.1%
3k
LongLoRA: Efficient fine-tuning for long-context LLMs
created 1 year ago
updated 1 year ago
Starred by Andrej Karpathy (Founder of Eureka Labs; Formerly at Tesla, OpenAI; Author of CS 231n), George Hotz (Author of tinygrad; Founder of the tiny corp, comma.ai), and 16 more.

TinyLlama by jzhang38

0.1%
9k
Tiny pretraining project for a 1.1B Llama model
created 1 year ago
updated 1 year ago
Starred by Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems"), Vincent Weisser (Cofounder of Prime Intellect), and 10 more.

codellama by meta-llama

0.0%
16k
Inference code for CodeLlama models
created 2 years ago
updated 1 year ago
Starred by Jiayi Pan (Author of SWE-Gym; MTS at xAI), Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems"), and 7 more.

EasyLM by young-geng

0%
2k
LLM training/finetuning framework in JAX/Flax
created 2 years ago
updated 1 year ago
Starred by Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems"), Elvis Saravia (Founder of DAIR.AI), and 1 more.

dolma by allenai

0.6%
1k
Toolkit for curating datasets for language model pre-training
created 2 years ago
updated 1 day ago
Starred by Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems") and Yaowei Zheng (Author of LLaMA-Factory).

FastEdit by hiyouga

0%
1k
Tool for fast edits to large language models
created 2 years ago
updated 2 years ago
Starred by Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems"), Vincent Weisser (Cofounder of Prime Intellect), and 2 more.

automated-interpretability by openai

0%
1k
Code and datasets for automated interpretability research
created 2 years ago
updated 1 year ago
Starred by Wing Lian (Founder of Axolotl AI), Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems"), and 3 more.

LLM-Blender by yuchenlin

0%
957
LLM ensembling framework using pairwise ranking and generative fusion
created 2 years ago
updated 9 months ago
Starred by Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems"), Elie Bursztein (Cybersecurity Lead at Google DeepMind), and 12 more.

open_llama by openlm-research

0.0%
8k
Open-source reproduction of LLaMA models
created 2 years ago
updated 2 years ago
Starred by Tobi Lutke (Cofounder of Shopify), Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems"), and 16 more.

qlora by artidoro

0.1%
11k
Finetuning tool for quantized LLMs
created 2 years ago
updated 1 year ago
Starred by George Hotz (Author of tinygrad; Founder of the tiny corp, comma.ai), Vincent Weisser (Cofounder of Prime Intellect), and 17 more.

StableLM by Stability-AI

0.0%
16k
Language models by Stability AI
created 2 years ago
updated 1 year ago
Starred by Sourabh Bajaj (Cofounder of Uplimit), Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems"), and 2 more.

NextChat by ChatGPTNextWeb

0.2%
85k
AI assistant for web, iOS, macOS, Android, Linux, and Windows
created 2 years ago
updated 6 days ago
Starred by Andrej Karpathy (Founder of Eureka Labs; Formerly at Tesla, OpenAI; Author of CS 231n), Carol Willing (Core Contributor to CPython, Jupyter), and 54 more.

langchain by langchain-ai

0.4%
114k
Framework for building LLM-powered applications
created 2 years ago
updated 17 hours ago
Starred by Anastasios Angelopoulos (Cofounder of LMArena), Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems"), and 26 more.

evals by openai

0.3%
17k
Framework for evaluating LLMs and LLM systems, plus benchmark registry
created 2 years ago
updated 8 months ago
Starred by Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems"), Jeff Hammerbacher (Cofounder of Cloudera), and 3 more.

ChatGLM-6B by zai-org

0.0%
41k
Bilingual dialogue language model for research
created 2 years ago
updated 1 year ago
Starred by Andrej Karpathy (Founder of Eureka Labs; Formerly at Tesla, OpenAI; Author of CS 231n), John Yang (Author of SWE-bench, SWE-agent), and 20 more.

stanford_alpaca by tatsu-lab

0.0%
30k
Instruction-following LLaMA model training and data generation
created 2 years ago
updated 1 year ago
Starred by Vincent Weisser (Cofounder of Prime Intellect), Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems"), and 4 more.

RL4LMs by allenai

0.2%
2k
RL library to fine-tune language models to human preferences
created 3 years ago
updated 1 year ago
Starred by Andrej Karpathy (Founder of Eureka Labs; Formerly at Tesla, OpenAI; Author of CS 231n), Anton Troynikov (Cofounder of Chroma), and 29 more.

llama_index by run-llama

0.3%
44k
Data framework for building LLM-powered agents
created 2 years ago
updated 20 hours ago
Starred by Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems"), Patrick von Platen (Research Engineer at Mistral; Author of Hugging Face Diffusers), and 10 more.

LoRA by microsoft

0.4%
13k
PyTorch library for low-rank adaptation (LoRA) of LLMs
created 4 years ago
updated 8 months ago
Starred by Tobi Lutke (Cofounder of Shopify), Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems"), and 14 more.

ColossalAI by hpcaitech

0.1%
41k
AI system for large-scale parallel training
created 3 years ago
updated 1 day ago
Starred by Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems"), Vincent Weisser (Cofounder of Prime Intellect), and 4 more.

hh-rlhf by anthropics

0.2%
2k
RLHF dataset for training safe AI assistants
created 3 years ago
updated 1 month ago
Starred by Yaowei Zheng (Author of LLaMA-Factory), Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems"), and 8 more.

helm by stanford-crfm

0.5%
2k
Open-source Python framework for holistic evaluation of foundation models
created 3 years ago
updated 20 hours ago
Starred by Boris Cherny (Creator of Claude Code; MTS at Anthropic), Andrej Karpathy (Founder of Eureka Labs; Formerly at Tesla, OpenAI; Author of CS 231n), and 29 more.

whisper by openai

0.5%
87k
Speech recognition model for multilingual transcription/translation
created 2 years ago
updated 1 month ago
Starred by Dan Abramov (Core Contributor to React), Patrick von Platen (Research Engineer at Mistral; Author of Hugging Face Diffusers), and 42 more.

stable-diffusion by CompVis

0.1%
71k
Latent text-to-image diffusion model
created 3 years ago
updated 1 year ago
Starred by Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems"), Jiayi Pan (Author of SWE-Gym; MTS at xAI), and 13 more.

alpa by alpa-projects

0.1%
3k
Auto-parallelization framework for large-scale neural network training and serving
created 4 years ago
updated 1 year ago
Starred by Yaowei Zheng (Author of LLaMA-Factory), Evan Hubinger (Head of Alignment Stress-Testing at Anthropic), and 1 more.

rome by kmeng01

0.1%
655
Model editing research paper for GPT-2 and GPT-J
created 3 years ago
updated 1 year ago
Starred by Ying Sheng (Author of SGLang), Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems"), and 6 more.

adapters by adapter-hub

0.1%
3k
Unified library for parameter-efficient transfer learning in NLP
created 5 years ago
updated 1 day ago
Starred by Victor Taelin (Author of Bend, Kind, HVM) and Jane Manchun Wong (Security Researcher; Tech Blogger).

dalle-2-preview by openai

0%
1k
created 3 years ago
updated 3 years ago
Starred by Andrej Karpathy (Founder of Eureka Labs; Formerly at Tesla, OpenAI; Author of CS 231n), Phil Wang (Prolific Research Paper Implementer), and 10 more.

vit-pytorch by lucidrains

0.3%
24k
PyTorch library for Vision Transformer variants and related techniques
created 4 years ago
updated 2 days ago
Starred by Andrej Karpathy (Founder of Eureka Labs; Formerly at Tesla, OpenAI; Author of CS 231n), Jiayi Pan (Author of SWE-Gym; MTS at xAI), and 7 more.

taming-transformers by CompVis

0.2%
6k
Image synthesis research paper using transformers
created 4 years ago
updated 1 year ago
Starred by Elie Bursztein (Cybersecurity Lead at Google DeepMind), Omar Khattab (Author of DSPy, ColBERT; Professor at MIT), and 11 more.

gpt-neo by EleutherAI

0.0%
8k
GPT-2/3-style model implementation using mesh-tensorflow
created 5 years ago
updated 3 years ago
Starred by Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems"), Jeff Hammerbacher (Cofounder of Cloudera), and 2 more.

lit by PAIR-code

0.0%
4k
Interactive ML model analysis tool for understanding model behavior
created 5 years ago
updated 1 day ago
Starred by Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems"), Didier Lopes (Founder of OpenBB), and 1 more.

qlib by microsoft

1.1%
28k
AI platform for quantitative investment research and production
created 5 years ago
updated 23 hours ago
Starred by Patrick von Platen (Research Engineer at Mistral; Author of Hugging Face Diffusers), Lewis Tunstall (Research Engineer at Hugging Face), and 2 more.

fastformers by microsoft

0%
707
NLU optimization recipes for transformer models
created 5 years ago
updated 4 months ago
Starred by Boris Cherny (Creator of Claude Code; MTS at Anthropic), Travis Fischer (Founder of Agentic), and 12 more.

prophet by facebook

0.1%
20k
Forecasting tool for time series data
created 8 years ago
updated 3 weeks ago
Starred by Clement Delangue (Cofounder of Hugging Face), Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems"), and 19 more.

datasets by huggingface

0.2%
21k
Access and process large AI datasets efficiently
created 5 years ago
updated 3 days ago
Starred by Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems"), Didier Lopes (Founder of OpenBB), and 16 more.

sentence-transformers by UKPLab

0.3%
17k
Framework for text embeddings, retrieval, and reranking
created 6 years ago
updated 1 week ago
Starred by Andrew Kane (Author of pgvector), Stas Bekman (Author of "Machine Learning Engineering Open Book"; Research Engineer at Snowflake), and 9 more.

xlnet by zihangdai

0%
6k
Language model research paper using generalized autoregressive pretraining
created 6 years ago
updated 2 years ago
Starred by Peter Norvig (Author of "Artificial Intelligence: A Modern Approach"; Research Director at Google), Aravind Srinivas (Cofounder of Perplexity), and 71 more.

tensorflow by tensorflow

0.1%
191k
Open-source ML framework
created 9 years ago
updated 17 hours ago