LLM-workshop-2024 by rasbt

Coding workshop for understanding LLM implementation and usage

Created 1 year ago

1,061 stars

Top 35.7% on SourcePulse

1 Expert Loves This Project

Yaowei Zheng

Author of LLaMA-Factory

Project Summary

This repository provides materials for a 4-hour coding workshop focused on understanding and implementing Large Language Models (LLMs) from scratch. It targets coders interested in the foundational building blocks, architecture, and practical application of LLMs, enabling them to build, pretrain, and finetune their own models.

How It Works

The workshop guides participants through coding a GPT-like LLM using PyTorch. It covers essential components: data input pipelines (tokenization, DataLoaders), core architectural elements, and the pretraining process. Subsequently, it demonstrates loading pretrained weights and finetuning LLMs using the LitGPT library, offering a practical bridge from foundational concepts to real-world usage.
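The data input pipeline described above can be illustrated with a minimal, dependency-free sketch of how next-token training pairs are typically cut from a token stream. The whitespace "tokenizer", the function name `sliding_window_pairs`, and the parameters `context_length` and `stride` are assumptions made for illustration; the workshop itself uses a real tokenizer and PyTorch DataLoaders, and its code may differ.

```python
def sliding_window_pairs(token_ids, context_length, stride):
    """Cut a token stream into (inputs, targets) pairs.

    Each target sequence is the input sequence shifted one position
    to the right, which is the usual next-token prediction setup.
    """
    pairs = []
    for start in range(0, len(token_ids) - context_length, stride):
        inputs = token_ids[start : start + context_length]
        targets = token_ids[start + 1 : start + context_length + 1]
        pairs.append((inputs, targets))
    return pairs


# Toy whitespace tokenizer (an assumption; real pipelines use subword tokenizers).
text = "the quick brown fox jumps over the lazy dog"
vocab = {word: idx for idx, word in enumerate(sorted(set(text.split())))}
token_ids = [vocab[word] for word in text.split()]

pairs = sliding_window_pairs(token_ids, context_length=4, stride=4)
for inputs, targets in pairs:
    print(inputs, "->", targets)
```

Setting `stride` equal to `context_length` produces non-overlapping windows; a smaller stride yields more, overlapping training examples from the same text.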

Quick Start & Requirements

The repository links to a ready-to-go cloud environment with all code and dependencies preinstalled, which allows the material to be run on a GPU.
Local setup instructions are also available in the setup folder.
Requires PyTorch and the LitGPT library.
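For a local setup, a quick way to confirm that the two requirements are importable is a small check like the one below. This is only a sketch: the helper name `check_requirements` is hypothetical, and it assumes the packages are importable as `torch` and `litgpt`; the setup folder in the repository remains the authoritative guide.

```python
import importlib.util


def check_requirements(modules=("torch", "litgpt")):
    """Return a mapping of module name -> whether it can be found for import."""
    return {name: importlib.util.find_spec(name) is not None for name in modules}


status = check_requirements()
for name, found in status.items():
    print(f"{name}: {'found' if found else 'missing'}")
```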

Highlighted Details

Hands-on coding of a GPT-like LLM architecture.
Practical demonstration of LLM pretraining on a small dataset.
Instructions for loading pretrained weights from popular LLMs (Llama, Phi, Gemma, Mistral) using LitGPT.
Covers finetuning techniques, including instruction finetuning.
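To make the instruction finetuning item more concrete, the sketch below shows how a single dataset entry is commonly turned into a prompt and a full training text. The Alpaca-style template and the field names `instruction`, `input`, and `output` are assumptions chosen for illustration; the workshop and LitGPT may use a different template.

```python
def format_instruction_example(entry):
    """Build (prompt, full_text) from one instruction-dataset entry.

    The prompt is what the model sees at inference time; the full text
    (prompt plus expected response) is what it is trained on.
    """
    prompt = (
        "Below is an instruction that describes a task. "
        "Write a response that appropriately completes the request."
        f"\n\n### Instruction:\n{entry['instruction']}"
    )
    # The optional input field carries extra context for the instruction.
    if entry.get("input"):
        prompt += f"\n\n### Input:\n{entry['input']}"
    prompt += "\n\n### Response:\n"
    return prompt, prompt + entry["output"]


# Hypothetical example entry, not taken from the workshop data.
example = {
    "instruction": "Convert the sentence to uppercase.",
    "input": "hello world",
    "output": "HELLO WORLD",
}
prompt, full_text = format_instruction_example(example)
print(full_text)
```

Keeping the prompt and the full text separate makes it straightforward to mask the prompt tokens out of the loss later, if that is desired.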

Maintenance & Community

The code material is based on the author's "Build a Large Language Model From Scratch" book.
Utilizes the LitGPT open-source library.

Licensing & Compatibility

The repository does not explicitly state a license. Users should verify compatibility for commercial or closed-source use.

Limitations & Caveats

The pretraining section uses a small text sample, so the self-built LLM can generate only basic sentences. The workshop focuses on understanding the core mechanics rather than achieving state-of-the-art performance.

Health Check

Last Commit

1 year ago

Responsiveness

Inactive

Star History

15 stars in the last 30 days