fcc-intro-to-llms by Infatoshi

Colab for building LLMs from scratch

Created 2 years ago

809 stars

Top 43.7% on SourcePulse

Project Summary

This repository provides a Google Colab notebook for learning how to build Large Language Models (LLMs) from scratch, targeting individuals who want to understand LLM internals without requiring local GPU hardware. It offers a hands-on approach to LLM development and training.

How It Works

The project uses PyTorch to implement and train the model, and falls back to CPU execution when no NVIDIA GPU is available. The notebook covers data loading, model architecture, and the training loop, so users can experiment with core LLM concepts without writing the supporting code themselves.
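The data-loading step mentioned above can be sketched in plain Python. This is a minimal illustration and not the notebook's code: it assumes a character-level tokenizer and a next-character prediction setup, and the names `encode`, `decode`, and `get_batch` as well as the stand-in corpus are hypothetical. It avoids PyTorch so that it runs without any installation.

```python
import random

# Stand-in corpus (assumption); a real run would read a dataset file instead.
text = "the wizard of oz " * 20

# Character-level vocabulary: every distinct character gets an integer id.
chars = sorted(set(text))
stoi = {ch: i for i, ch in enumerate(chars)}
itos = {i: ch for ch, i in stoi.items()}

def encode(s):
    """Map a string to a list of integer token ids."""
    return [stoi[c] for c in s]

def decode(ids):
    """Map a list of integer token ids back to a string."""
    return "".join(itos[i] for i in ids)

# Hold out the last 20% of the tokens for validation.
data = encode(text)
split = int(0.8 * len(data))
train_data, val_data = data[:split], data[split:]

def get_batch(source, block_size, batch_size, rng):
    """Sample random windows; each target is its input shifted by one token."""
    inputs, targets = [], []
    for _ in range(batch_size):
        start = rng.randrange(len(source) - block_size)
        inputs.append(source[start:start + block_size])
        targets.append(source[start + 1:start + block_size + 1])
    return inputs, targets

rng = random.Random(0)  # fixed seed so the sampled windows are repeatable
xb, yb = get_batch(train_data, block_size=8, batch_size=4, rng=rng)
```

In a PyTorch version the lists would typically become tensors moved to the selected device; the shifted-by-one relationship between `xb` and `yb` is what a training loop of this kind would learn from.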

Quick Start & Requirements

Install: pip install pylzma numpy ipykernel jupyter torch --index-url https://download.pytorch.org/whl/cu118
Prerequisites: Visual Studio 2022 (listed in the README for lzma compression), plus the OpenWebText dataset or a smaller corpus such as the Wizard of Oz text.
Hardware: NVIDIA GPU recommended for faster runtimes; CPU is supported but slower.
Links: Google Colab Notebook: https://colab.research.google.com/drive/1_7TNpEEl8xjHlr9JzKbK5AuDKXwAkHqj?usp=sharing
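The lzma requirement above suggests that the dataset is handled as compressed text. The sketch below shows one way to read such files and collect a character vocabulary. It rests on assumptions: that the data arrives as .xz-compressed UTF-8 text, and that Python's standard-library `lzma` module is an acceptable stand-in for the `pylzma` package named in the install line. The helper names `read_xz_text` and `build_vocab` and the sample file are hypothetical.

```python
import lzma
import tempfile
from pathlib import Path

def read_xz_text(path):
    """Decompress one .xz file and return its contents as a string."""
    with lzma.open(path, "rt", encoding="utf-8") as f:
        return f.read()

def build_vocab(paths):
    """Collect the sorted set of characters seen across all files."""
    vocab = set()
    for p in paths:
        vocab.update(read_xz_text(p))
    return sorted(vocab)

# Demo on a throwaway file so the sketch runs without downloading a dataset.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "sample.xz"
    with lzma.open(sample, "wt", encoding="utf-8") as f:
        f.write("Dorothy lived in Kansas.")
    text = read_xz_text(sample)
    vocab = build_vocab([sample])
```

For a corpus too large to hold in memory, the same loop could write each decompressed file to a single output file rather than returning strings.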

Highlighted Details

Focuses on building LLMs from scratch, not just fine-tuning existing models.
Provides a Google Colab environment for accessibility without local GPU setup.
Includes links to foundational research papers such as "Attention Is All You Need."
Offers guidance on setting up development environments like Jupyter Notebooks.
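For readers following the "Attention Is All You Need" link above, the paper's central operation is scaled dot-product attention: softmax(QK^T / sqrt(d_k)) V. The sketch below implements that formula in plain Python for small lists of vectors. It is an illustration of the published formula, not code from the notebook, and the function names are hypothetical.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    peak = max(xs)
    exps = [math.exp(x - peak) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def scaled_dot_product_attention(Q, K, V):
    """Compute softmax(Q K^T / sqrt(d_k)) V for lists of row vectors."""
    d_k = len(K[0])
    output = []
    for q in Q:
        # Similarity of this query to every key, scaled by sqrt(d_k).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in K]
        weights = softmax(scores)
        # Weighted average of the value vectors.
        output.append([
            sum(w * v[j] for w, v in zip(weights, V))
            for j in range(len(V[0]))
        ])
    return output

# A zero query scores every key equally, so the result is the mean of the values.
Q = [[0.0, 0.0]]
K = [[1.0, 0.0], [0.0, 1.0]]
V = [[1.0, 2.0], [3.0, 4.0]]
result = scaled_dot_product_attention(Q, K, V)
```

A decoder-style language model would additionally mask out future positions before the softmax; that step is omitted here to keep the sketch short.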

Maintenance & Community

The project is associated with freeCodeCamp. Its author, Elliot Arledge, shares content on Twitter/X, YouTube, and LinkedIn, and a Discord server is available for community interaction.

Licensing & Compatibility

The repository's license is not explicitly stated in the provided README. Compatibility for commercial use or closed-source linking is therefore undetermined.

Limitations & Caveats

The README states that detailed explanations will be added as questions and issues are posted, so the content may still be evolving. Training will be significantly slower on CPU-only machines.
