happy-llm by datawhalechina

LLM tutorial from scratch

Created 1 year ago
24,017 stars

Top 1.7% on SourcePulse

Project Summary

Happy-LLM is a free, comprehensive tutorial for understanding and implementing Large Language Models (LLMs) from scratch. It targets university students, researchers, and AI enthusiasts with programming experience, aiming to demystify LLM principles and practical training. The project provides a structured learning path from NLP fundamentals to advanced applications like RAG and Agents.

How It Works

The tutorial systematically breaks down LLMs, starting with foundational NLP concepts and progressing to the Transformer architecture, including its attention mechanisms. It then covers pre-trained language models (PLMs) and the characteristics specific to LLMs, such as emergent abilities and training strategies. A key feature is the hands-on implementation of a LLaMA2 model in PyTorch, covering tokenizer training and pre-training.
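
The attention computation at the heart of the Transformer material can be captured in a few lines of PyTorch. The sketch below is a minimal illustration of scaled dot-product attention, not the tutorial's own code; shapes and names are illustrative.

    import torch
    import torch.nn.functional as F

    def scaled_dot_product_attention(q, k, v, mask=None):
        # q, k, v: (batch, num_heads, seq_len, head_dim)
        scores = q @ k.transpose(-2, -1) / (q.size(-1) ** 0.5)
        if mask is not None:
            # Positions where mask == 0 cannot be attended to (e.g., causal masking)
            scores = scores.masked_fill(mask == 0, float("-inf"))
        weights = F.softmax(scores, dim=-1)
        return weights @ v

    # Toy example: batch of 2, 4 heads, sequence length 8, head dimension 16
    q = k = v = torch.randn(2, 4, 8, 16)
    out = scaled_dot_product_attention(q, k, v)
    print(out.shape)  # torch.Size([2, 4, 8, 16])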

Quick Start & Requirements

  • Installation: Primarily a learning resource; code examples are provided within the tutorial.
  • Prerequisites: Python programming experience, familiarity with deep learning and NLP concepts recommended.
  • Resources: Code examples are designed to be runnable, with specific model weights available for download (e.g., Happy-LLM-Chapter5-Base-215M, Happy-LLM-Chapter5-SFT-215M); a loading sketch follows this list.
  • Links: PDF Download, ModelScope
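
As a rough illustration of how a released checkpoint might be used, the sketch below assumes the weights have been downloaded locally from ModelScope and are compatible with the Hugging Face transformers API; if they are not, use the loading code provided in the tutorial itself. The local path is a placeholder.

    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Placeholder path -- point this at the directory where the download was saved
    model_dir = "./Happy-LLM-Chapter5-Base-215M"

    tokenizer = AutoTokenizer.from_pretrained(model_dir)
    model = AutoModelForCausalLM.from_pretrained(model_dir)

    inputs = tokenizer("Large language models are", return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=50)
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))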

Highlighted Details

  • Free and open-source educational content.
  • Hands-on implementation of a LLaMA2 model.
  • Covers pre-training, supervised fine-tuning, and efficient fine-tuning methods (LoRA/QLoRA); a LoRA sketch follows this list.
  • Explores advanced applications like RAG and Agents.
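
To make the efficient fine-tuning point concrete, the sketch below shows how a LoRA adapter can be attached with the peft library. This is a minimal illustration, not the tutorial's exact recipe; the model path and hyperparameters are placeholders, and the target module names assume a LLaMA-style architecture.

    from peft import LoraConfig, get_peft_model
    from transformers import AutoModelForCausalLM

    # Placeholder path to a base checkpoint (same assumption as the loading sketch above)
    base_model = AutoModelForCausalLM.from_pretrained("./Happy-LLM-Chapter5-Base-215M")

    lora_config = LoraConfig(
        r=8,                                  # rank of the low-rank update matrices
        lora_alpha=16,                        # scaling factor applied to the update
        target_modules=["q_proj", "v_proj"],  # assumed LLaMA-style attention projections
        lora_dropout=0.05,
        task_type="CAUSAL_LM",
    )

    model = get_peft_model(base_model, lora_config)
    model.print_trainable_parameters()  # only the small adapter matrices are trainable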

Maintenance & Community

  • Led by Datawhale members and advised by academic experts.
  • Open to contributions via Issues, feature suggestions, content improvements, and Pull Requests.
  • Community engagement encouraged through Datawhale's platform.

Licensing & Compatibility

  • Licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License (CC BY-NC-SA 4.0).
  • Non-commercial use with attribution is permitted; commercial use is not, and derivative works must be shared under the same license.

Limitations & Caveats

The tutorial is a learning resource, not a production-ready framework. Chapter 6 on advanced training practices is marked as "in progress" (🚧). The licensing restricts commercial applications.

Health Check

  • Last Commit: 1 week ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 5
  • Issues (30d): 3
  • Star History: 1,432 stars in the last 30 days

Starred by Peter Norvig (author of "Artificial Intelligence: A Modern Approach"; research director at Google), Elvis Saravia (founder of DAIR.AI), and 3 more.

Explore Similar Projects

  • Hands-On-Large-Language-Models by HandsOnLLM: code examples for the "Hands-On Large Language Models" book. ~20k stars (top 0.8% on SourcePulse); created 1 year ago, updated 3 weeks ago. Starred by Shizhe Diao (author of LMFlow; research scientist at NVIDIA), Michael Han (cofounder of Unsloth), and 18 more.

  • llm-course by mlabonne: LLM course with roadmaps and notebooks. ~73k stars (top 0.8% on SourcePulse); created 2 years ago, updated 2 weeks ago.