AI-Infra-from-Zero-to-Hero  by HuaizhengZhang

Curated list of machine learning systems resources

created 6 years ago
3,149 stars

Top 15.6% on sourcepulse

Project Summary

This repository curates resources for building and understanding machine learning and large language model (LLM) systems, targeting engineers and researchers interested in production-grade AI infrastructure. It provides a structured overview of key concepts, papers, and tools across various ML system categories, aiming to guide users from foundational knowledge to advanced industry practices.

How It Works

The project is a knowledge base that organizes links to academic papers, conference talks, books, courses, and blog posts. It groups resources by ML system component (training, inference, data processing, and LLM-specific infrastructure) so that readers can follow a systematic learning path. Where available, links to code repositories let readers explore implementations of the concepts.

Quick Start & Requirements

This is a curated list of resources, not a software package. No installation or specific requirements are needed beyond internet access to view the linked content.

Highlighted Details

  • Extensive coverage of major AI/ML systems conferences (OSDI, NSDI, MLSys, etc.).
  • Includes links to foundational papers like "Hidden technical debt in machine learning systems."
  • Features resources on LLM infrastructure, training, and serving.
  • Provides links to video tutorials and courses from leading universities and industry experts.

Maintenance & Community

The project is actively maintained by a team and welcomes pull requests. Links to YouTube, Bilibili, and Xiaohongshu are provided for video content. A new website is under development.

Licensing & Compatibility

The repository itself is a collection of links and does not have a specific software license. The licenses of the linked resources vary.

Limitations & Caveats

As a curated list, the depth of coverage for each topic can vary, and the project does not provide direct implementation or support for the linked resources. The rapidly evolving nature of AI systems means some linked content may become outdated.

Health Check

  • Last commit: 1 week ago
  • Responsiveness: Inactive
  • Pull requests (30d): 2
  • Issues (30d): 0
  • Star history: 272 stars in the last 90 days
