fcc-intro-to-llms by Infatoshi

Colab for building LLMs from scratch

Created 2 years ago
782 stars

Top 44.8% on SourcePulse

Project Summary

This repository provides a Google Colab notebook for learning how to build Large Language Models (LLMs) from scratch. It is aimed at people who want to understand LLM internals but have no local GPU hardware, and it takes a hands-on approach to LLM development and training.

How It Works

The project uses PyTorch to implement and train the model, and falls back to CPU execution for users without an NVIDIA GPU. It includes code for data loading, the model architecture, and the training loop, which abstracts away much of the complexity of building an LLM from the ground up. The approach is educational: users are meant to experiment with core LLM concepts.
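The GPU-to-CPU fallback described above can be sketched in a few lines. This is a minimal illustration, not the notebook's actual code: the helper name pick_device is hypothetical, and the check for a missing PyTorch install is an added assumption so the snippet also runs in a bare Python environment.

```python
import importlib.util


def pick_device() -> str:
    """Return "cuda" when PyTorch can see an NVIDIA GPU, otherwise "cpu".

    pick_device is a hypothetical helper name used only for this sketch.
    """
    if importlib.util.find_spec("torch") is None:
        # PyTorch is not installed, so report the CPU as the only option.
        return "cpu"

    import torch

    # torch.cuda.is_available() is True only when a usable CUDA device exists.
    return "cuda" if torch.cuda.is_available() else "cpu"


device = pick_device()
print(f"Using device: {device}")

# With PyTorch installed, the model and every batch would then be moved to
# this device before training, for example model.to(device) and x.to(device).
```

Choosing the device once and passing it everywhere keeps the same training code usable on Colab's GPU runtime and on a CPU-only laptop, at the cost of slower runs on the latter.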

Quick Start & Requirements

  • Install (the cu118 index URL selects CUDA 11.8 builds of PyTorch): pip install pylzma numpy ipykernel jupyter torch --index-url https://download.pytorch.org/whl/cu118
  • Prerequisites: Visual Studio 2022 (for lzma compression), OpenWebText dataset (or a mini dataset like Wizard of Oz).
  • Hardware: NVIDIA GPU recommended for faster runtimes; CPU is supported but slower.
  • Links: Google Colab Notebook: https://colab.research.google.com/drive/1_7TNpEEl8xjHlr9JzKbK5AuDKXwAkHqj?usp=sharing
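As a rough illustration of why lzma support appears in the prerequisites, the sketch below reads an .xz-compressed text file and collects its character vocabulary. Several things here are assumptions rather than facts from the README: that the dataset is handled as .xz-compressed text, that a character-level vocabulary is the next step, and the helper names read_xz_text and build_char_vocab. It also uses Python's standard-library lzma module in place of the pylzma package named in the install line, and the sample sentence is only a placeholder.

```python
import lzma
import tempfile
from pathlib import Path


def read_xz_text(path: Path) -> str:
    """Decompress an .xz file and return its contents as a string."""
    with lzma.open(path, "rt", encoding="utf-8") as f:
        return f.read()


def build_char_vocab(text: str) -> list[str]:
    """Return the sorted list of distinct characters in the text."""
    return sorted(set(text))


# Demo: write a tiny placeholder file, then read it back.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "sample.xz"
    with lzma.open(sample, "wt", encoding="utf-8") as f:
        f.write("Dorothy lived in the midst of the great Kansas prairies.")

    text = read_xz_text(sample)
    vocab = build_char_vocab(text)

print(f"{len(text)} characters, {len(vocab)} distinct")
```

For a small corpus such as a single book, a plain text file read with open() would work just as well; compression only starts to matter for a dataset of OpenWebText's size.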

Highlighted Details

  • Focuses on building LLMs from scratch, not just fine-tuning existing models.
  • Provides a Google Colab environment for accessibility without local GPU setup.
  • Includes links to foundational research papers like "Attention is All You Need."
  • Offers guidance on setting up development environments like Jupyter Notebooks.

Maintenance & Community

The project is associated with freeCodeCamp. Its author, Elliot Arledge, shares content on Twitter/X, YouTube, and LinkedIn, and a Discord server is available for community interaction.

Licensing & Compatibility

The repository's license is not explicitly stated in the provided README. Compatibility for commercial use or closed-source linking is therefore undetermined.

Limitations & Caveats

The README mentions that detailed explanations will be added as questions and issues are posted, suggesting the content may be evolving. Performance will be significantly slower on CPU-only machines.

Health Check

  • Last Commit: 2 months ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 15 stars in the last 30 days
