LLM-workshop-2024 by rasbt

Coding workshop for understanding LLM implementation and usage

created 1 year ago
995 stars

Top 38.0% on sourcepulse

Project Summary

This repository provides materials for a 4-hour coding workshop focused on understanding and implementing Large Language Models (LLMs) from scratch. It targets coders interested in the foundational building blocks, architecture, and practical application of LLMs, enabling them to build, pretrain, and finetune their own models.

How It Works

The workshop guides participants through coding a GPT-like LLM using PyTorch. It covers essential components: data input pipelines (tokenization, DataLoaders), core architectural elements, and the pretraining process. Subsequently, it demonstrates loading pretrained weights and finetuning LLMs using the LitGPT library, offering a practical bridge from foundational concepts to real-world usage.
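
The data input step described above can be illustrated with a minimal sketch. It is not the workshop's code: the workshop uses PyTorch DataLoaders, whereas this version uses plain Python so that it runs without dependencies, and the character-level tokenizer and the helper names (`build_vocab`, `encode`, `sliding_windows`) are assumptions made for illustration. The idea it shows is general to GPT-style pretraining: each training target is the input window shifted one token to the right.

```python
def build_vocab(text):
    """Map each distinct character to an integer ID (toy character-level tokenizer)."""
    chars = sorted(set(text))
    return {ch: i for i, ch in enumerate(chars)}


def encode(text, vocab):
    """Convert a string into a list of token IDs."""
    return [vocab[ch] for ch in text]


def sliding_windows(token_ids, context_length, stride):
    """Return (input, target) pairs for next-token prediction.

    Each target is its input window shifted right by one token.
    """
    pairs = []
    for start in range(0, len(token_ids) - context_length, stride):
        inputs = token_ids[start:start + context_length]
        targets = token_ids[start + 1:start + context_length + 1]
        pairs.append((inputs, targets))
    return pairs


text = "hello world"
vocab = build_vocab(text)
ids = encode(text, vocab)
pairs = sliding_windows(ids, context_length=4, stride=4)

for inputs, targets in pairs:
    print(inputs, "->", targets)
```

In a PyTorch setting, pairs like these would typically be wrapped in a `Dataset` and batched by a `DataLoader`. A stride equal to the context length gives non-overlapping windows, while a smaller stride yields more (overlapping) training examples from the same text.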

Quick Start & Requirements

  • A ready-to-go cloud environment with all code and dependencies is available via a provided link, enabling GPU execution.
  • Local setup instructions are also available in the setup folder.
  • Requires PyTorch and the LitGPT library.

Highlighted Details

  • Hands-on coding of a GPT-like LLM architecture.
  • Practical demonstration of LLM pretraining on a small dataset.
  • Instruction on loading weights from popular LLMs (Llama, Phi, Gemma, Mistral) using LitGPT.
  • Covers finetuning techniques, including instruction finetuning.
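
Instruction finetuning usually starts by turning each dataset record into a single prompt string. The sketch below assumes an Alpaca-style template and a record layout with `instruction`, `input`, and `output` keys; both are assumptions for illustration, and the workshop or LitGPT may use a different format. The function name `format_instruction` is hypothetical.

```python
def format_instruction(entry):
    """Build an Alpaca-style prompt from a record (template is an assumption)."""
    prompt = (
        "Below is an instruction that describes a task. "
        "Write a response that appropriately completes the request."
        f"\n\n### Instruction:\n{entry['instruction']}"
    )
    # The input section is optional; skip it when the record has no input text.
    if entry.get("input"):
        prompt += f"\n\n### Input:\n{entry['input']}"
    return prompt + "\n\n### Response:\n"


entry = {
    "instruction": "Rewrite the sentence in passive voice.",
    "input": "The chef cooks the meal.",
    "output": "The meal is cooked by the chef.",
}

prompt = format_instruction(entry)
# During finetuning, the model is trained on the prompt followed by the response.
training_text = prompt + entry["output"]
print(training_text)
```

At inference time, only the prompt is passed to the model, and the text it generates after `### Response:` is taken as the answer.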

Maintenance & Community

  • The code material is based on the author's "Build a Large Language Model From Scratch" book.
  • Utilizes the LitGPT open-source library.

Licensing & Compatibility

  • The repository does not explicitly state a license. Users should verify compatibility for commercial or closed-source use.

Limitations & Caveats

The pretraining section uses a small text sample, meaning the self-built LLM will only generate basic sentences. The workshop focuses on understanding the core mechanics rather than achieving state-of-the-art performance.

Health Check

  • Last commit: 6 months ago
  • Responsiveness: 1 day
  • Pull requests (30d): 0
  • Issues (30d): 0

Star History

  • 75 stars in the last 90 days
