Ying Sheng
Coauthor of SGLang
Starred Projects (122)
SpecForge
by
sgl-project
7.9%
396
Train speculative decoding models for faster inference
Starred by
Created 3 months ago
Updated 2 days ago
ome
by
sgl-project
2.6%
271
Kubernetes operator for LLM serving
Starred by
Created 4 months ago
Updated 15 hours ago
ChatLearn
by
alibaba
0.9%
425
Training framework for large-scale alignment tasks
Created 2 years ago
Updated 12 hours ago
OpenRLHF
by
OpenRLHF
0.7%
8k
RLHF framework for scalable training of large language models
Starred by
+8
Created 2 years ago
Updated 3 days ago
verl
by
volcengine
1.9%
13k
RL training library for LLMs
Starred by
+13
Created 10 months ago
Updated 19 hours ago
how-to-optim-algorithm-in-cuda
by
BBuf
0.9%
2k
CUDA optimization guide for common algorithms
Created 7 years ago
Updated 3 days ago
sgl-learning-materials
by
sgl-project
1.4%
578
Learning materials for SGLang, an efficient LLM serving engine
Starred by
Created 1 year ago
Updated 2 weeks ago
glake
by
antgroup
0.2%
480
GPU optimization library for memory management and IO
Created 2 years ago
Updated 5 months ago
xgrammar
by
mlc-ai
2.4%
1k
Library for efficient structured generation
Starred by
+4
Created 1 year ago
Updated 20 hours ago
NeMo
by
NVIDIA-NeMo
0.3%
16k
Scalable generative AI framework for LLMs, multimodal, and speech AI research
Starred by
+14
Created 6 years ago
Updated 11 hours ago
Nanoflow
by
efeslab
0.5%
891
LLM serving framework for high throughput
Starred by
Created 1 year ago
Updated 1 day ago
InfiniteBench
by
OpenBMB
0.6%
347
Benchmark for evaluating language models on super-long contexts (100k+ tokens)
Starred by
Created 1 year ago
Updated 11 months ago
simple-evals
by
openai
0.3%
4k
Lightweight library for evaluating language models
Starred by
+13
Created 1 year ago
Updated 1 month ago
ScaleLLM
by
vectorch-ai
0.4%
466
LLM inference system for production environments
Created 2 years ago
Updated 3 days ago
mergekit
by
arcee-ai
0.4%
6k
CLI tool for merging pretrained language models, combining strengths without retraining
Starred by
+14
Created 2 years ago
Updated 22 hours ago
RouteLLM
by
lm-sys
0.3%
4k
Framework for LLM routing and cost reduction (research paper)
Starred by
+6
Created 1 year ago
Updated 1 year ago
GPTQModel
by
ModelCloud
1.3%
784
LLM compression toolkit for accelerated CPU/GPU inference
Starred by
Created 1 year ago
Updated 12 hours ago
Mooncake
by
kvcache-ai
1.3%
4k
Research paper on a disaggregated architecture for LLM serving
Starred by
+2
Created 1 year ago
Updated 13 hours ago
SWE-bench
by
SWE-bench
2.3%
4k
Benchmark for evaluating LLMs on real-world GitHub issues
Starred by
+11
Created 1 year ago
Updated 17 hours ago
inspect_ai
by
UKGovernmentBEIS
1.1%
1k
Framework for large language model evaluations
Starred by
+5
Created 1 year ago
Updated 23 hours ago
Quest
by
mit-han-lab
0%
333
Inference framework for efficient long-context LLM inference
Created 1 year ago
Updated 2 months ago
dspy
by
stanfordnlp
0.8%
28k
Framework for programming, rather than prompting, language models
Starred by
+49
Created 2 years ago
Updated 12 hours ago
DoRA
by
NVlabs
0.3%
854
PyTorch code for weight-decomposed low-rank adaptation (DoRA)
Starred by
Created 1 year ago
Updated 11 months ago
RULER
by
NVIDIA
0.8%
1k
Evaluation suite for long-context language models (research paper)
Starred by
Created 1 year ago
Updated 1 month ago
mirage
by
mirage-project
2.2%
2k
Tool for fast GPU kernel generation via superoptimization
Starred by
+1
Created 1 year ago
Updated 1 day ago
beir
by
beir-cellar
0.2%
2k
IR benchmark for evaluating NLP retrieval models
Starred by
+1
Created 4 years ago
Updated 3 months ago
storm
by
stanford-oval
0.2%
27k
LLM system for automated knowledge curation and article generation
Starred by
+5
Created 1 year ago
Updated 2 months ago
llm.c
by
karpathy
0.2%
28k
LLM training in pure C/CUDA, no PyTorch needed
Starred by
+26
Created 1 year ago
Updated 2 months ago
LeetCUDA
by
xlite-dev
6.7%
7k
CUDA learning notes for beginners using PyTorch
Created 2 years ago
Updated 3 days ago
candle
by
huggingface
0.5%
18k
Minimalist ML framework for Rust, emphasizing performance and ease of use
Starred by
+22
Created 2 years ago
Updated 3 days ago
Open-Sora-Plan
by
PKU-YuanGroup
0.1%
12k
Open-source project aiming to reproduce Sora-like T2V model
Starred by
+2
Created 1 year ago
Updated 2 months ago
MiniGPT4-video
by
Vision-CAIR
0.2%
628
Video-language model for short and long video understanding
Created 1 year ago
Updated 9 months ago
adapters
by
adapter-hub
0.2%
3k
Unified library for parameter-efficient transfer learning in NLP
Starred by
+7
Created 5 years ago
Updated 1 month ago
lmdeploy
by
InternLM
0.7%
7k
Toolkit for LLM compression, deployment, and serving
Starred by
+8
Created 2 years ago
Updated 13 hours ago
guidance
by
guidance-ai
0.1%
21k
Programming paradigm for steering LLMs
Starred by
+38
Created 2 years ago
Updated 23 hours ago
trl
by
huggingface
0.6%
16k
Library for transformer RL
Starred by
+28
Created 5 years ago
Updated 12 hours ago
sglang
by
sgl-project
1.2%
18k
Fast serving framework for LLMs and vision language models
Starred by
+32
Created 1 year ago
Updated 11 hours ago
MultiPL-E
by
nuprl
0.4%
270
Benchmark for evaluating code generation LLMs across multiple programming languages
Created 3 years ago
Updated 1 month ago
rags
by
run-llama
0.0%
7k
Streamlit app for building RAG pipelines via natural language
Starred by
Created 1 year ago
Updated 1 year ago
llm-reasoners
by
maitrix-org
0.3%
2k
Library for advanced LLM reasoning with search algorithms
Starred by
Created 2 years ago
Updated 3 months ago
ToolBench
by
OpenBMB
0.2%
5k
Open platform for LLM tool learning (ICLR'24 spotlight)
Starred by
+6
Created 2 years ago
Updated 4 months ago
autogen
by
microsoft
0.5%
50k
Agentic framework for multi-agent AI applications
Starred by
+19
Created 2 years ago
Updated 17 hours ago
webarena
by
web-arena-x
1.0%
1k
Web environment for autonomous agent development
Starred by
Created 2 years ago
Updated 1 week ago
WebGLM
by
THUDM
0%
2k
Web-enhanced question answering system using a 10B GLM
Created 2 years ago
Updated 5 months ago
DeepSpeed-MII
by
deepspeedai
0.1%
2k
Python library for high-throughput, low-latency, and cost-effective model inference
Starred by
+5
Created 3 years ago
Updated 2 months ago
megablocks
by
databricks
0.6%
1k
Lightweight library for mixture-of-experts (MoE) training
Starred by
+15
Created 2 years ago
Updated 2 months ago
EAGLE
by
SafeAILab
10.6%
2k
Speculative decoding research paper for faster LLM inference
Starred by
+5
Created 1 year ago
Updated 1 week ago
gpt-fast
by
meta-pytorch
0.2%
6k
PyTorch text generation for efficient transformer inference
Starred by
+20
Created 1 year ago
Updated 3 weeks ago
VLM_survey
by
jingyi0000
0.2%
3k
VLM survey paper with links to models/methods for vision tasks
Created 2 years ago
Updated 3 months ago
dify
by
langgenius
0.5%
114k
Open-source LLM app development platform
Starred by
+17
Created 2 years ago
Updated 11 hours ago
LookaheadDecoding
by
hao-ai-lab
0.2%
1k
Parallel decoding algorithm for faster LLM inference
Starred by
Created 1 year ago
Updated 6 months ago
ARES
by
stanford-futuredata
0.6%
656
RAG evaluation framework
Starred by
Created 2 years ago
Updated 5 months ago
llm-analysis
by
cli99
0.4%
455
CLI tool for LLM latency/memory analysis during training/inference
Starred by
Created 2 years ago
Updated 5 months ago
Awesome-LLM-Inference
by
xlite-dev
0.9%
5k
Curated list of LLM/VLM inference research papers with code
Created 2 years ago
Updated 1 month ago
gpt_paper_assistant
by
tatsu-lab
0%
536
ArXiv scanner using GPT-4 for personalized paper recommendations
Starred by
Created 1 year ago
Updated 1 year ago
MergeLM
by
yule-BUAA
0.1%
850
Codebase for merging language models via parameter averaging
Starred by
Created 1 year ago
Updated 1 year ago
ChatRTX
by
NVIDIA
0.3%
3k
Demo app for local RAG chatbot on Windows
Starred by
Created 1 year ago
Updated 5 months ago
DeepSeek-Coder
by
deepseek-ai
0.1%
22k
Code LLM for code completion and generation
Starred by
+4
Created 1 year ago
Updated 1 year ago
llm-decontaminator
by
lm-sys
0.3%
310
LLM contamination detector for quantifying rephrased samples
Starred by
Created 1 year ago
Updated 1 year ago
S-LoRA
by
S-LoRA
0.2%
2k
System for scalable LoRA adapter serving
Starred by
+1
Created 1 year ago
Updated 1 year ago
flashinfer
by
flashinfer-ai
1.0%
4k
Kernel library for LLM serving
Starred by
+10
Created 2 years ago
Updated 20 hours ago
skypilot
by
skypilot-org
0.5%
9k
Framework for cloud AI/batch jobs, unifying execution across diverse infrastructure
Starred by
+24
Created 4 years ago
Updated 13 hours ago
gpu_poor
by
RahulSChand
0.2%
1k
CLI tool for LLM memory and throughput estimation
Created 2 years ago
Updated 9 months ago
ggml
by
ggml-org
0.3%
13k
Tensor library for machine learning
Starred by
+16
Created 3 years ago
Updated 2 days ago
tensorrtllm_backend
by
triton-inference-server
0.2%
889
Triton backend for serving TensorRT-LLM models
Starred by
Created 2 years ago
Updated 1 day ago
TensorRT-LLM
by
NVIDIA
0.5%
12k
LLM inference optimization SDK for NVIDIA GPUs
Starred by
+17
Created 2 years ago
Updated 11 hours ago
leptonai
by
leptonai
0.1%
3k
Python framework for simplifying AI service building
Starred by
+1
Created 2 years ago
Updated 1 day ago
punica
by
punica-ai
0.2%
1k
LoRA serving system (research paper) for multi-tenant LLM inference
Starred by
+2
Created 2 years ago
Updated 1 year ago
modular
by
modular
0.1%
25k
AI toolchain unifying fragmented AI deployment workflows
Starred by
+8
Created 2 years ago
Updated 1 day ago
llm-numbers
by
ray-project
0%
4k
LLM developer's reference for key numbers
Starred by
+8
Created 2 years ago
Updated 1 year ago
LLaMA2-Accessory
by
Alpha-VLLM
0%
3k
Open-source toolkit for LLM development, pretraining, finetuning, and deployment
Starred by
+1
Created 2 years ago
Updated 8 months ago
rerope
by
bojone
0%
381
Position embeddings research paper
Starred by
Created 2 years ago
Updated 1 year ago
LightLLM
by
ModelTC
0.5%
4k
Python framework for LLM inference and serving
Starred by
+6
Created 2 years ago
Updated 11 hours ago
GPTCache
by
zilliztech
0.2%
8k
Semantic cache for LLM queries, integrated with LangChain and LlamaIndex
Starred by
+7
Created 2 years ago
Updated 2 months ago
llama
by
meta-llama
0.1%
59k
Inference code for Llama 2 models (deprecated)
Starred by
+37
Created 2 years ago
Updated 7 months ago
LLM-Training-Puzzles
by
srush
0.5%
1k
Hands-on puzzles for large language model training
Starred by
+8
Created 2 years ago
Updated 1 year ago
ringattention
by
haoliuhl
0.5%
740
Jax implementation of RingAttention for large context models (research paper)
Starred by
Created 2 years ago
Updated 7 months ago
metal-flash-attention
by
philipturner
1.1%
533
Metal port of FlashAttention for Apple silicon
Starred by
+2
Created 2 years ago
Updated 1 year ago
companion-app
by
a16z-infra
0.1%
6k
AI companion stack for personalized chatbots
Starred by
+6
Created 2 years ago
Updated 1 year ago
bitsandbytes
by
bitsandbytes-foundation
0.3%
8k
PyTorch library for k-bit quantization, enabling accessible LLMs
Starred by
+26
Created 4 years ago
Updated 2 days ago
long_llama
by
CStanKonrad
0%
1k
LLM for long context handling, fine-tuned with Focused Transformer
Starred by
Created 2 years ago
Updated 1 year ago
scalene
by
plasma-umass
0.2%
13k
Python profiler with AI-powered optimization proposals
Starred by
+14
Created 5 years ago
Updated 1 week ago
LLMSurvey
by
RUCAIBox
0.2%
12k
Survey paper for large language models
Starred by
+2
Created 2 years ago
Updated 6 months ago
Awesome-LLM-Compression
by
HuangOwen
0.3%
2k
LLM compression papers and tools for efficient training/inference
Created 2 years ago
Updated 2 months ago
fastllm
by
ztxz16
0.4%
4k
High-performance C++ LLM inference library
Starred by
Created 2 years ago
Updated 1 week ago
H2O
by
FMInference
0.2%
473
KV cache eviction research paper for efficient LLM inference
Starred by
Created 2 years ago
Updated 1 year ago
LongChat
by
DachengLi1
0.2%
533
Long-context LLM chatbot training and evaluation framework
Starred by
+2
Created 2 years ago
Updated 1 year ago
xgen
by
salesforce
0.1%
723
LLM research release with 8k sequence length
Starred by
Created 2 years ago
Updated 7 months ago
peft
by
huggingface
0.3%
20k
Parameter-efficient fine-tuning (PEFT) library
Starred by
+16
Created 2 years ago
Updated 2 days ago
AutoGPT
by
Significant-Gravitas
0.1%
179k
AI agent platform for building, deploying, and running autonomous workflows
Starred by
+55
Created 2 years ago
Updated 11 hours ago
CTranslate2
by
OpenNMT
0.3%
4k
Fast inference engine for Transformer models
Starred by
+6
Created 6 years ago
Updated 5 months ago
text-generation-inference
by
huggingface
0.2%
11k
Rust/Python/gRPC server for fast LLM text generation
Starred by
+34
Created 2 years ago
Updated 1 day ago
mlc-llm
by
mlc-ai
0.3%
21k
Universal LLM deployment engine with ML compilation
Starred by
+21
Created 2 years ago
Updated 1 day ago
awesome-mixture-of-experts
by
XueFuzhao
0.3%
1k
Curated list of resources for mixture-of-experts (MoE) research
Created 3 years ago
Updated 9 months ago
InternLM-techreport
by
InternLM
0%
906
Multilingual LLM research paper with 104B parameters
Starred by
Created 2 years ago
Updated 2 years ago
awesome-chatgpt-dataset
by
voidful
0.1%
737
Dataset repo for LLM training
Starred by
Created 2 years ago
Updated 1 year ago
RWKV-LM
by
BlinkDL
0.2%
14k
RNN for LLM, transformer-level performance, parallelizable training
Starred by
+28
Created 4 years ago
Updated 3 days ago
qdrant
by
qdrant
0.8%
26k
Vector database for similarity search in AI applications
Starred by
+22
Created 5 years ago
Updated 1 day ago
alpaca-lora
by
tloen
0.0%
19k
LoRA fine-tuning for LLaMA
Starred by
+22
Created 2 years ago
Updated 1 year ago
trlx
by
CarperAI
0.0%
5k
Distributed RLHF for LLMs
Starred by
+16
Created 3 years ago
Updated 1 year ago
MOSS
by
OpenMOSS
0.0%
12k
Open-source tool-augmented conversational language model
Starred by
+2
Created 2 years ago
Updated 1 year ago
CodeGeeX
by
zai-org
0.1%
9k
Code generation model for multilingual programming
Created 3 years ago
Updated 1 year ago
baize-chatbot
by
project-baize
0%
3k
Chat model trained via LoRA, using ChatGPT-generated dialogs
Starred by
+3
Created 2 years ago
Updated 1 year ago
ChatGDB
by
pgosar
0%
927
CLI tool for debugging with natural language via LLM
Starred by
Created 2 years ago
Updated 9 months ago
FastChat
by
lm-sys
0.1%
39k
Open platform for training, serving, and evaluating LLM-based chatbots
Starred by
+35
Created 2 years ago
Updated 3 months ago
fastertransformer_backend
by
triton-inference-server
0%
412
Triton backend for optimized transformer inference
Created 4 years ago
Updated 1 year ago
LMFlow
by
OptimalScale
0.0%
8k
Toolkit for finetuning and inference of large foundation models
Starred by
+9
Created 2 years ago
Updated 1 month ago
langchain
by
langchain-ai
0.4%
116k
Framework for building LLM-powered applications
Starred by
+80
Created 2 years ago
Updated 1 day ago
llama_index
by
run-llama
0.3%
44k
Data framework for building LLM-powered agents
Starred by
+41
Created 2 years ago
Updated 17 hours ago
FlexLLMGen
by
FMInference
0.1%
9k
High-throughput generation engine for LLMs with limited GPU memory
Starred by
+20
Created 2 years ago
Updated 10 months ago
PiPPy
by
pytorch
0%
779
PyTorch tool for pipeline parallelism
Starred by
+3
Created 3 years ago
Updated 1 year ago
best_AI_papers_2022
by
louisfb01
0%
3k
AI paper list (2022) with video explanations and code
Starred by
Created 3 years ago
Updated 1 year ago
dpm-solver
by
LuChengTHU
0.2%
2k
Fast ODE solver for diffusion probabilistic model sampling
Starred by
Created 3 years ago
Updated 1 year ago
FasterTransformer
by
NVIDIA
0.1%
6k
Optimized transformer library for inference
Starred by
+12
Created 4 years ago
Updated 1 year ago
CodeGen
by
salesforce
0%
5k
Open-source model family for program synthesis
Starred by
+8
Created 3 years ago
Updated 7 months ago
GLM-130B
by
zai-org
0.0%
8k
Bilingual model for research and evaluation
Starred by
+6
Created 3 years ago
Updated 2 years ago
Megatron-LM
by
NVIDIA
0.5%
14k
Framework for training transformer models at scale
Starred by
+18
Created 6 years ago
Updated 12 hours ago
llm-seminar
by
craffel
0%
311
Course reading list for large language models
Starred by
Created 3 years ago
Updated 2 years ago
paper-reading
by
mli
0.2%
31k
Deep learning paper readings
Starred by
+2
Created 3 years ago
Updated 6 months ago
CodeT5
by
salesforce
0.2%
3k
Code LLMs for code understanding and generation research
Starred by
+3
Created 4 years ago
Updated 1 year ago
Dive-into-DL-PyTorch
by
ShusenTang
0.1%
19k
PyTorch rewrite of "Dive into Deep Learning" book
Created 6 years ago
Updated 3 years ago
alpa
by
alpa-projects
0.0%
3k
Auto-parallelization framework for large-scale neural network training and serving
Starred by
+17
Created 4 years ago
Updated 1 year ago