InfraTech by CalvinXKY

Accelerating AI Infrastructure with practical code and deep dives

Created 6 months ago

2,366 stars

Top 18.8% on SourcePulse

Project Summary

The InfraTech repository is a practical educational resource on AI infrastructure (AI Infra) for engineers and researchers. It offers Python notebooks and articles covering large-model training and inference frameworks (PyTorch, vLLM, SGLang), performance optimization, and hardware fundamentals. The project aims to speed up learning in AI Infra through hands-on code examples and clear explanations of complex topics.

How It Works

InfraTech takes a learn-by-doing approach, presenting AI Infra concepts through executable Python notebooks and detailed articles. It dissects complex areas such as attention mechanisms, inference optimization (speculative decoding, KV caching), and distributed systems. The focus is on practical implementation: the material often reimplements core components (e.g., the vLLM scheduler) or visualizes internal workings (e.g., PyTorch memory) to build deep understanding.
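
To make the KV caching idea mentioned above concrete, here is a minimal sketch in plain Python. It is not taken from the repository's notebooks: the single attention head, the `project` helper (a plain scaling standing in for learned projections), and the toy token vectors are all assumptions chosen for illustration. It shows that caching keys and values across decode steps gives the same outputs as recomputing them for the whole prefix at every step.

```python
import math

def attend(query, keys, values):
    """Scaled dot-product attention for one query over lists of key/value vectors."""
    scale = 1.0 / math.sqrt(len(query))
    scores = [scale * sum(q * k for q, k in zip(query, key)) for key in keys]
    peak = max(scores)  # subtract the max for a numerically stable softmax
    weights = [math.exp(s - peak) for s in scores]
    total = sum(weights)
    dim = len(values[0])
    return [sum(w * v[i] for w, v in zip(weights, values)) / total for i in range(dim)]

def project(x, factor):
    # Hypothetical stand-in for a learned linear projection: a plain scaling.
    return [factor * xi for xi in x]

def decode_without_cache(tokens):
    """Recompute keys and values for the whole prefix at every step."""
    outputs = []
    for t in range(len(tokens)):
        prefix = tokens[: t + 1]
        keys = [project(x, 0.5) for x in prefix]
        values = [project(x, 2.0) for x in prefix]
        outputs.append(attend(project(tokens[t], 1.5), keys, values))
    return outputs

def decode_with_cache(tokens):
    """Compute each token's key and value once and append them to a cache."""
    key_cache, value_cache = [], []
    outputs = []
    for x in tokens:
        key_cache.append(project(x, 0.5))
        value_cache.append(project(x, 2.0))
        outputs.append(attend(project(x, 1.5), key_cache, value_cache))
    return outputs

tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # toy embeddings, illustrative only
uncached = decode_without_cache(tokens)
cached = decode_with_cache(tokens)
print(all(math.isclose(a, b) for u, c in zip(uncached, cached) for a, b in zip(u, c)))
```

The uncached loop does projection work that grows quadratically with sequence length, while the cached loop projects each token once. That saving is paid for in memory, which is why cache management is a central topic in inference engines.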

Quick Start & Requirements

Installation: Clone the repository; requires a Python environment with Jupyter Notebooks/Lab.
Prerequisites: Python, relevant framework packages (PyTorch, vLLM, SGLang). Performance topics may require NVIDIA GPUs, CUDA, and NCCL.
Links:
- Author's Zhihu: https://www.zhihu.com/people/xky7
- BasicCUDA Repo: https://github.com/CalvinXKY/BasicCUDA
- WeChat Public Account: "InfraTech"
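
As a rough way to check the prerequisites above before opening a notebook, the following sketch reports which packages can be found in the current Python environment. The module names (`torch`, `vllm`, `sglang`, `jupyterlab`) are the usual import names for these projects and are an assumption here; the repository may require additional or different packages.

```python
from importlib.util import find_spec

def check_packages(module_names):
    """Map each top-level module name to whether it can be located (without importing it)."""
    return {name: find_spec(name) is not None for name in module_names}

# Assumed import names; adjust to whatever a given notebook actually imports.
report = check_packages(["torch", "vllm", "sglang", "jupyterlab"])
for name, available in report.items():
    print(f"{name}: {'found' if available else 'missing'}")
```

`find_spec` only locates a module, so the check stays fast and does not initialize CUDA or load model code.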

Highlighted Details

- In-depth explorations of inference optimizations: ChunkedPrefill, FlashDecoding, speculative decoding.
- Detailed walkthroughs and reimplementations of vLLM (scheduler, memory) and SGLang (RadixAttention) components.
- Practical code for advanced concepts: LoRA to Multi-LoRA, parallelization strategies (PD (prefill/decode) separation, AFD, EPLB).
- Tools for analyzing LLM memory, MFU (Model FLOPs Utilization), and PyTorch computation graphs.
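
The memory and MFU analysis mentioned in the last item can be approximated with two back-of-the-envelope formulas. The sketch below is not the repository's tooling. It assumes the common estimate of about 6 FLOPs per parameter per training token (which ignores attention FLOPs) and a KV cache that stores one key and one value vector per layer, KV head, and token. The function names and the example numbers are illustrative assumptions.

```python
def estimate_mfu(num_params, tokens_per_second, peak_flops_per_second):
    """Training MFU under the ~6 * N FLOPs-per-token approximation (attention ignored)."""
    achieved_flops_per_second = 6 * num_params * tokens_per_second
    return achieved_flops_per_second / peak_flops_per_second

def kv_cache_bytes(num_layers, num_kv_heads, head_dim, seq_len, batch_size,
                   bytes_per_element=2):
    """KV cache size: keys plus values (factor 2) for every layer, head, and token."""
    return (2 * num_layers * num_kv_heads * head_dim
            * seq_len * batch_size * bytes_per_element)

# Illustrative numbers: a 7B-parameter model at 4,000 tokens/s on one GPU with a
# 312 TFLOP/s peak, and a 32-layer, 32-KV-head, 128-dim model at 4,096 tokens in 16-bit.
mfu = estimate_mfu(7e9, 4_000, 312e12)
cache = kv_cache_bytes(num_layers=32, num_kv_heads=32, head_dim=128,
                       seq_len=4096, batch_size=1)
print(f"MFU ~ {mfu:.1%}")
print(f"KV cache ~ {cache / 2**30:.1f} GiB")
```

Both functions are deliberately simple. Models that use grouped-query attention have fewer KV heads than query heads, which shrinks the cache estimate proportionally.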

Maintenance & Community

Maintained by CalvinXKY, with links to the author's Zhihu and WeChat public account ("InfraTech"). A related BasicCUDA GitHub repo exists. No dedicated community channels (e.g., Discord, Slack) are listed.

Licensing & Compatibility

The repository's license is not explicitly stated in the provided README content, so its compatibility with commercial use or closed-source linking is undetermined.

Limitations & Caveats

This repository is primarily an educational resource, not a production-ready library. Notebooks marked as "practice" may be simplified implementations for learning. The absence of explicit licensing is a significant adoption blocker.
