LLMs-Zero-to-Hero by bbruceyuan

Tutorial for building LLMs from scratch

Created 8 months ago
1,659 stars

Top 25.5% on SourcePulse

Project Summary

This repository provides a comprehensive, hands-on guide to building and understanding Large Language Models (LLMs) from scratch. It targets engineers and researchers who want to implement LLM training, fine-tuning, and deployment themselves, offering a structured learning path with accompanying video tutorials.

How It Works

The project emphasizes a "from scratch" implementation approach, mirroring Andrej Karpathy's educational style. It covers foundational LLM concepts, dense models, Mixture-of-Experts (MoE) architectures, and various fine-tuning techniques like Supervised Fine-Tuning (SFT), Direct Preference Optimization (DPO), and Reinforcement Learning from Human Feedback (RLHF). The code is designed to be educational, with explanations integrated into the development process.
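To make one of these techniques concrete, below is a minimal sketch of the standard DPO objective in PyTorch. This is an illustration of the published DPO loss, not the repository's own code; the function and variable names are hypothetical, and the inputs are per-sequence log-probabilities under the trainable policy and a frozen reference model.

```python
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps: torch.Tensor,
             policy_rejected_logps: torch.Tensor,
             ref_chosen_logps: torch.Tensor,
             ref_rejected_logps: torch.Tensor,
             beta: float = 0.1) -> torch.Tensor:
    # Log-ratios of the trainable policy against the frozen reference
    chosen_logratio = policy_chosen_logps - ref_chosen_logps
    rejected_logratio = policy_rejected_logps - ref_rejected_logps
    # Widen the margin between preferred and rejected responses
    margin = beta * (chosen_logratio - rejected_logratio)
    return -F.logsigmoid(margin).mean()
```

Unlike RLHF, which trains a separate reward model and then optimizes against it with PPO, DPO folds the preference signal directly into this single supervised loss.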

Quick Start & Requirements

  • Installation: Code lives in the src/ directory, organized by chapter; the notebooks can be run directly.
  • Prerequisites: A GPU is required for training; an NVIDIA RTX 3090 or 4090 is the recommended minimum (a quick environment check follows this list).
  • Resources: GPU discount coupons are available through an AIStackDC registration link.
  • Documentation: Accompanying video lectures on Bilibili are linked from the chapter descriptions.
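Before launching the training notebooks, it can help to verify that the environment meets the GPU requirement. A minimal check, assuming a PyTorch setup (which a from-scratch LLM tutorial in this style typically uses):

```python
import torch

# Quick environment check before running the training notebooks
assert torch.cuda.is_available(), "A CUDA GPU (e.g. RTX 3090/4090) is required"
props = torch.cuda.get_device_properties(0)
print(f"{props.name}: {props.total_memory / 1e9:.1f} GB VRAM")
```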

Highlighted Details

  • End-to-end LLM training and fine-tuning from scratch.
  • Detailed explanations of MoE architectures, including DeepSeek's MLA (Multi-Head Latent Attention) algorithm; a minimal routing sketch follows this list.
  • Coverage of the evolution of activation functions and of inference optimization techniques.
  • Dedicated sections on code-LLM development and LLM deployment.
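The sketch below illustrates the two ideas named above: sparse top-k expert routing (the core of MoE layers) with SwiGLU-style gated experts (a late stage in the activation-function evolution the tutorial covers). It is an illustrative sketch under those assumptions, not the repository's implementation and not MLA itself.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SwiGLUExpert(nn.Module):
    """SwiGLU feed-forward block: down(SiLU(gate(x)) * up(x))."""
    def __init__(self, d_model: int, d_ff: int):
        super().__init__()
        self.gate = nn.Linear(d_model, d_ff, bias=False)
        self.up = nn.Linear(d_model, d_ff, bias=False)
        self.down = nn.Linear(d_ff, d_model, bias=False)

    def forward(self, x):
        return self.down(F.silu(self.gate(x)) * self.up(x))

class TopKMoE(nn.Module):
    """Minimal top-k routed Mixture-of-Experts layer."""
    def __init__(self, d_model=512, d_ff=1024, n_experts=8, k=2):
        super().__init__()
        self.k = k
        self.router = nn.Linear(d_model, n_experts, bias=False)
        self.experts = nn.ModuleList(
            SwiGLUExpert(d_model, d_ff) for _ in range(n_experts))

    def forward(self, x):                           # x: (n_tokens, d_model)
        logits = self.router(x)                     # (n_tokens, n_experts)
        weights, idx = logits.topk(self.k, dim=-1)  # top-k experts per token
        weights = F.softmax(weights, dim=-1)        # renormalize over the k chosen
        out = torch.zeros_like(x)
        for e, expert in enumerate(self.experts):
            for slot in range(self.k):
                mask = idx[:, slot] == e            # tokens routed to expert e
                if mask.any():
                    out[mask] += weights[mask, slot].unsqueeze(-1) * expert(x[mask])
        return out

# Usage: route 16 tokens through the sparse layer
tokens = torch.randn(16, 512)
moe = TopKMoE()
print(moe(tokens).shape)  # torch.Size([16, 512])
```

Only k of the n_experts feed-forward blocks run per token, which is how MoE models grow parameter count without a proportional increase in compute per token.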

Maintenance & Community

The project is maintained by bbruceyuan, with community engagement encouraged via WeChat, a personal blog, and a public WeChat account.

Licensing & Compatibility

The repository's licensing is not explicitly stated in the provided README.

Limitations & Caveats

Some sections, such as the nanoGPT implementation and activation function optimization, are marked as "todo," indicating incomplete content. The project is presented as an ongoing learning resource.

Health Check

  • Last commit: 5 months ago
  • Responsiveness: Inactive
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 69 stars in the last 30 days
