how-to-learn-deep-learning-framework by BBuf

Deep learning framework learning resources (PyTorch, OneFlow)

Created 2 years ago
452 stars

Top 66.7% on SourcePulse

Project Summary

This repository serves as a learning resource for understanding the internal mechanisms of deep learning frameworks, primarily PyTorch and OneFlow. It targets engineers and researchers seeking to deepen their knowledge of framework design, performance optimization, and CUDA implementation, offering a comprehensive collection of articles and source code analyses.

How It Works

The repository collects articles, blog posts, and source code walkthroughs focused on PyTorch and OneFlow. Topics range from fundamentals such as Tensor manipulation and autograd to advanced areas such as memory management, distributed training, CUDA kernel implementation, and compiler infrastructure (TorchScript, TorchDynamo). The material is organized so that readers can build a systematic picture of how these frameworks work internally.
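As a taste of the autograd topic mentioned above, here is a minimal sketch of reverse-mode automatic differentiation on scalars in plain Python. It is not taken from the repository and is not how PyTorch or OneFlow implement autograd; the `Value` class and its attributes are hypothetical names chosen for illustration. It only shows the core idea: record how each result was computed, then walk the graph backwards applying the chain rule.

```python
class Value:
    """Hypothetical scalar node that remembers how it was computed."""

    def __init__(self, data, parents=()):
        self.data = float(data)
        self.grad = 0.0
        self._parents = parents
        self._backward_fn = None  # pushes self.grad into the parents' grads

    @staticmethod
    def _wrap(other):
        return other if isinstance(other, Value) else Value(other)

    def __add__(self, other):
        other = Value._wrap(other)
        out = Value(self.data + other.data, (self, other))

        def backward_fn():
            # d(a + b)/da = 1 and d(a + b)/db = 1
            self.grad += out.grad
            other.grad += out.grad

        out._backward_fn = backward_fn
        return out

    def __mul__(self, other):
        other = Value._wrap(other)
        out = Value(self.data * other.data, (self, other))

        def backward_fn():
            # d(a * b)/da = b and d(a * b)/db = a
            self.grad += other.data * out.grad
            other.grad += self.data * out.grad

        out._backward_fn = backward_fn
        return out

    def backward(self):
        # Build a topological order so each node is processed
        # only after every node that depends on it.
        order, seen = [], set()

        def visit(node):
            if id(node) not in seen:
                seen.add(id(node))
                for parent in node._parents:
                    visit(parent)
                order.append(node)

        visit(self)
        self.grad = 1.0
        for node in reversed(order):
            if node._backward_fn is not None:
                node._backward_fn()


x = Value(3.0)
y = Value(2.0)
z = x * y + x  # z = x*y + x
z.backward()
print(z.data, x.grad, y.grad)  # dz/dx = y + 1, dz/dy = x
```

Real frameworks apply the same principle to tensors and add details this sketch omits, such as gradient buffers, in-place operation tracking, and C++/CUDA backward kernels.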

Quick Start & Requirements

  • Installation: No direct installation is required as this is a curated collection of learning materials.
  • Prerequisites: Familiarity with Python, deep learning concepts, and optionally C++/CUDA for deeper dives.
  • Resources: Access to the GitHub repository and potentially the linked articles/blogs.

Highlighted Details

  • Extensive coverage of PyTorch internals, including autograd, memory management, distributed training (DP/DDP), and optimization.
  • Detailed analysis of PyTorch 2.0 features like TorchDynamo and AOTAutograd.
  • In-depth exploration of OneFlow's architecture, including Global Tensor, operator implementation, and performance optimizations.
  • Articles on CUDA kernel optimization for specific operations like Softmax and LayerNorm.
  • Articles on PyTorch FX, along with comparisons and integration points between PyTorch and JAX.
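For readers new to the Softmax and LayerNorm operations named above, the following plain-Python reference sketch shows the math that such kernels compute for a single row. It is an illustration only, not code from the repository or from any framework; the function names and the `eps` default are assumptions, and the LayerNorm here omits the learnable scale and shift. Optimized CUDA kernels compute the same quantities but focus on doing the reductions (max, sum, mean, variance) efficiently in parallel.

```python
import math


def softmax(xs):
    """Numerically stable softmax over one row of floats."""
    # Subtracting the row maximum keeps exp() from overflowing
    # without changing the result.
    row_max = max(xs)
    exps = [math.exp(x - row_max) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def layer_norm(xs, eps=1e-5):
    """Normalize one row to zero mean and unit variance (no affine step).

    eps is an assumed small constant that guards against division by zero.
    """
    mean = sum(xs) / len(xs)
    var = sum((x - mean) ** 2 for x in xs) / len(xs)
    inv_std = 1.0 / math.sqrt(var + eps)
    return [(x - mean) * inv_std for x in xs]


probs = softmax([1000.0, 1001.0, 1002.0])  # large inputs, no overflow
normed = layer_norm([1.0, 2.0, 3.0, 4.0])
print(probs)
print(normed)
```

A reference like this is also useful as a correctness check when experimenting with a hand-written kernel: run both on the same row and compare the outputs within a small tolerance.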

Maintenance & Community

The repository is maintained by BBuf, with contributions from various individuals listed in the article titles (e.g., Xu Xiaoyu, Huang Zhuobin, Li Xiang). Links to related repositories for CUDA and deep learning compiler learning are provided.

Licensing & Compatibility

The provided text does not state a license for the repository. Hosting on GitHub does not by itself grant an open-source license, so check the repository for a license file before reusing its content. The linked articles and frameworks (PyTorch, OneFlow) have their own respective licenses.

Limitations & Caveats

This repository is a collection of learning materials and does not represent a runnable framework itself. The depth and breadth of coverage may vary, and some articles might be outdated relative to the latest framework versions.

Health Check

  • Last commit: 1 year ago
  • Responsiveness: Inactive
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 4 stars in the last 30 days
