LLM101n by karpathy

Educational resource for building a Storyteller AI LLM

Created 1 year ago
34,406 stars

Top 1.0% on SourcePulse

Project Summary

This repository outlines a comprehensive, end-to-end course for building a Storyteller AI Large Language Model (LLM) from scratch. Aimed at individuals seeking a deep understanding of AI and LLMs, it guides users through creating, refining, and illustrating stories with an AI, culminating in a ChatGPT-like web application.

How It Works

The course progresses from fundamental concepts like Bigram Language Models and micrograd for backpropagation to advanced topics including attention mechanisms, Transformer architectures (GPT-2), tokenization (minBPE), optimization techniques (AdamW), and speed enhancements via device utilization, mixed precision, and distributed training. It covers dataset handling, inference optimizations (kv-cache, quantization), various finetuning methods (SFT, RLHF), and deployment strategies.
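As a rough illustration of the syllabus's starting point, a character-level bigram language model can be sketched in a few lines of Python. This sketch is not taken from the course, which has no published code. The function names (`train_bigram`, `next_char_probs`, `sample`) and the toy corpus are assumptions made for this example.

```python
import random
from collections import defaultdict

def train_bigram(text):
    """Count how often each character follows each other character."""
    counts = defaultdict(lambda: defaultdict(int))
    for prev, nxt in zip(text, text[1:]):
        counts[prev][nxt] += 1
    return counts

def next_char_probs(counts, prev):
    """Turn the follower counts for `prev` into a probability distribution."""
    followers = counts.get(prev, {})
    total = sum(followers.values())
    return {ch: n / total for ch, n in followers.items()}

def sample(counts, start, length, seed=0):
    """Generate text by repeatedly sampling the next character."""
    rng = random.Random(seed)
    out = [start]
    for _ in range(length):
        followers = counts.get(out[-1])
        if not followers:
            break  # no observed continuation for this character
        chars = list(followers)
        weights = [followers[c] for c in chars]
        out.append(rng.choices(chars, weights=weights)[0])
    return "".join(out)

# Hypothetical toy corpus, chosen only for illustration.
corpus = "once upon a time there was a tiny storyteller"
counts = train_bigram(corpus)
print(next_char_probs(counts, "t"))
print(sample(counts, "o", 40))
```

A count-based model like this conditions only on the previous character. The later topics in the syllabus (attention, Transformers) replace the count table with a learned network that conditions on a longer context.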

Quick Start & Requirements

This repository is a course syllabus and contains no runnable code. Eureka Labs is still developing the course.

Highlighted Details

  • Covers building LLMs from scratch using Python, C, and CUDA.
  • Includes chapters on optimization techniques for speed and efficiency (device, precision, distributed training).
  • Details inference optimizations like kv-cache and quantization.
  • Explores supervised and reinforcement learning-based finetuning methods.
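To make the quantization item above concrete, here is a minimal sketch of symmetric per-tensor int8 quantization in plain Python. It is one common scheme, not necessarily the one the course will teach. The names `quantize_int8` and `dequantize` and the sample weights are assumptions made for this example.

```python
def quantize_int8(weights):
    """Map floats to integers in [-127, 127] using one shared scale."""
    max_abs = max(abs(w) for w in weights)
    # Fall back to a scale of 1.0 when every weight is zero.
    scale = max_abs / 127 if max_abs > 0 else 1.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate floats from the stored integers."""
    return [v * scale for v in q]

# Hypothetical weights, chosen only for illustration.
weights = [0.5, -1.27, 0.0, 0.9]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
max_error = max(abs(a - b) for a, b in zip(weights, restored))
print(q, scale, max_error)
```

Each weight is stored as an 8-bit integer instead of a 32-bit float, at the cost of a rounding error of at most about half the scale per weight.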

Maintenance & Community

The repository is archived while Eureka Labs develops the course. No details on community channels or a roadmap are available yet.

Licensing & Compatibility

The repository is archived and does not specify a license.

Limitations & Caveats

The course content is under development and not yet available. The repository is archived, indicating the project is not actively maintained in its current state.

Health Check

Last Commit: 1 year ago
Responsiveness: Inactive
Pull Requests (30d): 0
Issues (30d): 0

Star History

236 stars in the last 30 days
