nn-zero-to-hero by karpathy

Educational resource for neural network development, from basics to advanced models

Created 3 years ago

19,679 stars

Top 2.3% on SourcePulse

Project Summary

This repository provides a hands-on course on neural networks, from fundamental principles to architectures such as Transformers. It is aimed at learners with basic Python and high school calculus, and proceeds step by step through YouTube video lectures with accompanying Jupyter notebooks in which each concept is implemented in code.

How It Works

The course progresses from building a tiny scalar-valued autograd engine from scratch (micrograd), through character-level language models (makemore), to a full GPT implementation. It emphasizes practical coding, explaining concepts such as backpropagation, tensors, activation functions, batch normalization, and attention mechanisms by implementing them directly, first in plain Python and then in PyTorch. Implementing each piece by hand builds intuition for the mechanics of neural network training and inference.
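To make the backpropagation idea concrete, here is a minimal sketch of a scalar autograd engine in the spirit of micrograd. It is not the course's code: the class name Value, its attributes, and the example neuron are assumptions chosen for illustration, and only addition, multiplication, and tanh are supported.

```python
import math


class Value:
    """A scalar that remembers how it was computed, so gradients can flow back."""

    def __init__(self, data, parents=()):
        self.data = data
        self.grad = 0.0
        self._parents = parents
        self._backward = lambda: None  # replaced by the operation that creates this node

    def __add__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        out = Value(self.data + other.data, (self, other))

        def _backward():
            # d(a + b)/da = 1 and d(a + b)/db = 1
            self.grad += out.grad
            other.grad += out.grad

        out._backward = _backward
        return out

    def __mul__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        out = Value(self.data * other.data, (self, other))

        def _backward():
            # d(a * b)/da = b and d(a * b)/db = a
            self.grad += other.data * out.grad
            other.grad += self.data * out.grad

        out._backward = _backward
        return out

    def tanh(self):
        t = math.tanh(self.data)
        out = Value(t, (self,))

        def _backward():
            # d tanh(x)/dx = 1 - tanh(x)^2
            self.grad += (1.0 - t * t) * out.grad

        out._backward = _backward
        return out

    def backward(self):
        # Topologically sort the graph, then apply the chain rule from the output back.
        order, seen = [], set()

        def visit(node):
            if node not in seen:
                seen.add(node)
                for parent in node._parents:
                    visit(parent)
                order.append(node)

        visit(self)
        self.grad = 1.0
        for node in reversed(order):
            node._backward()


# One tanh neuron with two inputs: y = tanh(w1*x1 + w2*x2 + b)
x1, x2 = Value(2.0), Value(-1.0)
w1, w2, b = Value(0.5), Value(1.5), Value(0.25)
y = (w1 * x1 + w2 * x2 + b).tanh()
y.backward()
print(y.data, w1.grad, w2.grad, b.grad)
```

Gradients are accumulated with += so that a value used more than once in an expression receives the sum of its contributions, which is what the multivariate chain rule requires.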

Quick Start & Requirements

Install: Clone the repository and install dependencies via pip install -r requirements.txt (specific requirements vary per lecture, often including torch, numpy, matplotlib).
Prerequisites: Python 3.x, basic Python programming, high school calculus.
Resources: Jupyter notebooks can be run locally or via Google Colab. Some lectures may benefit from GPU acceleration for larger models.
Links: YouTube Playlist, micrograd repo, makemore repo.
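
Before opening a notebook, it can help to confirm that the packages named above are importable. The following is a small sketch using only the standard library; the package list is an assumption taken from this summary, not from the repository, and individual lectures may need more or fewer packages.

```python
import importlib.util
import sys

# Assumed package list, taken from the packages named in this summary.
PACKAGES = ["torch", "numpy", "matplotlib"]


def missing_packages(names):
    """Return the names that cannot be found on the current Python path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_packages(PACKAGES)
print("Python", sys.version.split()[0])
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All listed packages are importable.")
```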

Highlighted Details

Builds foundational libraries like micrograd for understanding backpropagation.
Implements character-level language models, progressing to MLPs, CNNs (WaveNet-like), and Transformers (GPT).
Covers essential ML concepts: training, hyperparameters, overfitting, batch normalization, and tokenization.
Demonstrates manual backpropagation and tensor manipulation for deep learning intuition.
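
As an illustration of what a character-level language model does, here is a minimal count-based bigram sketch in plain Python. It is a simplification for illustration rather than the course's implementation: the word list is made up, and the function names are hypothetical.

```python
import math
import random
from collections import Counter, defaultdict

# Toy dataset (made up for this sketch); "." marks the start and end of a word.
words = ["emma", "olivia", "ava", "mia", "amelia"]

# Count how often each character follows each other character.
counts = defaultdict(Counter)
for word in words:
    chars = ["."] + list(word) + ["."]
    for prev, nxt in zip(chars, chars[1:]):
        counts[prev][nxt] += 1


def next_char_probs(prev):
    """Normalize the counts that follow `prev` into a probability distribution."""
    total = sum(counts[prev].values())
    return {ch: n / total for ch, n in counts[prev].items()}


def sample_word(rng, max_len=20):
    """Sample characters one at a time until the end marker is drawn."""
    out, prev = [], "."
    while len(out) < max_len:
        probs = next_char_probs(prev)
        prev = rng.choices(list(probs), weights=list(probs.values()))[0]
        if prev == ".":
            break
        out.append(prev)
    return "".join(out)


def avg_negative_log_likelihood(data):
    """Average negative log-likelihood per bigram: lower means a better fit."""
    total, n = 0.0, 0
    for word in data:
        chars = ["."] + list(word) + ["."]
        for prev, nxt in zip(chars, chars[1:]):
            total -= math.log(next_char_probs(prev)[nxt])
            n += 1
    return total / n


rng = random.Random(0)
print([sample_word(rng) for _ in range(3)])
print(avg_negative_log_likelihood(words))
```

The average negative log-likelihood is the same kind of loss that a neural language model minimizes by gradient descent; here the probabilities come from counting instead of training.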

Maintenance & Community

The project is primarily driven by Andrej Karpathy. Community interaction is largely through YouTube comments and associated discussions.

Licensing & Compatibility

License: MIT.
Compatibility: Permissive MIT license allows for commercial use and integration into closed-source projects.

Limitations & Caveats

The course is structured around video lectures; notebooks are provided, but the primary learning path is video-centric. Some advanced topics, such as residual connections and the Adam optimizer, are noted as future additions or left for self-study.

Health Check

Last Commit: 1 year ago
Responsiveness: Inactive

Star History: 556 stars in the last 30 days