happy-llm by datawhalechina

LLM tutorial from scratch

Created 1 year ago
17,537 stars

Top 2.7% on SourcePulse

View on GitHub
Project Summary

Happy-LLM is a free, comprehensive tutorial for understanding and implementing Large Language Models (LLMs) from scratch. It targets university students, researchers, and AI enthusiasts with programming experience, aiming to demystify LLM principles and practical training. The project provides a structured learning path from NLP fundamentals to advanced applications like RAG and Agents.

How It Works

The tutorial systematically breaks down LLMs, starting with foundational NLP concepts and progressing to the Transformer architecture, including attention mechanisms. It then covers pre-trained language models (PLMs) and the characteristics specific to LLMs, such as emergent abilities and training strategies. A key feature is the hands-on implementation of a LLaMA2-style model in PyTorch, covering tokenizer training and pre-training.
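To give a feel for the attention mechanism the tutorial builds toward, here is a minimal PyTorch sketch of single-head scaled dot-product self-attention. The function name, tensor shapes, and dimensions are illustrative assumptions, not the tutorial's exact code.

```python
import math
import torch
import torch.nn.functional as F

def self_attention(x: torch.Tensor, w_q, w_k, w_v) -> torch.Tensor:
    """Minimal single-head self-attention: softmax(QK^T / sqrt(d_k)) V."""
    q, k, v = x @ w_q, x @ w_k, x @ w_v                 # project input to queries, keys, values
    d_k = q.size(-1)
    scores = q @ k.transpose(-2, -1) / math.sqrt(d_k)   # (batch, seq, seq) attention logits
    weights = F.softmax(scores, dim=-1)                 # normalize over the key dimension
    return weights @ v                                   # weighted sum of value vectors

# Toy usage: batch of 2 sequences, length 4, model dimension 8
d_model = 8
x = torch.randn(2, 4, d_model)
w_q = torch.randn(d_model, d_model)
w_k = torch.randn(d_model, d_model)
w_v = torch.randn(d_model, d_model)
out = self_attention(x, w_q, w_k, w_v)                  # -> shape (2, 4, 8)
```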

Quick Start & Requirements

  • Installation: Primarily a learning resource; code examples are provided within the tutorial.
  • Prerequisites: Python programming experience; familiarity with deep learning and NLP concepts is recommended.
  • Resources: Code examples are designed to be runnable, with trained model weights available for download (e.g., Happy-LLM-Chapter5-Base-215M, Happy-LLM-Chapter5-SFT-215M); see the download sketch after this list.
  • Links: PDF Download, ModelScope
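A hypothetical sketch of fetching the released weights, assuming they are published on ModelScope. The repository id below is a placeholder, not a confirmed path; check the project's ModelScope page for the actual id.

```python
from modelscope import snapshot_download

# Placeholder repo id -- replace with the id listed on the project's ModelScope page.
model_dir = snapshot_download("your-namespace/Happy-LLM-Chapter5-Base-215M")
print("weights downloaded to:", model_dir)
```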

Highlighted Details

  • Free and open-source educational content.
  • Hands-on implementation of a LLaMA2-style model in PyTorch.
  • Covers pre-training, supervised fine-tuning, and parameter-efficient fine-tuning methods (LoRA/QLoRA); see the sketch after this list.
  • Explores advanced applications like RAG and Agents.
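To illustrate the idea behind LoRA-style efficient fine-tuning mentioned above, here is a minimal PyTorch sketch of a LoRA-augmented linear layer. The class name, rank, and scaling values are illustrative assumptions, not the tutorial's implementation.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Frozen base linear layer plus a trainable low-rank update: W x + (alpha/r) * B A x."""
    def __init__(self, base: nn.Linear, r: int = 8, alpha: int = 16):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad_(False)                      # freeze the pretrained weights (and bias)
        self.lora_a = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.lora_b = nn.Parameter(torch.zeros(base.out_features, r))  # zero-init: training starts at W
        self.scaling = alpha / r

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.base(x) + self.scaling * (x @ self.lora_a.T @ self.lora_b.T)

# Toy usage: wrap a projection layer and confirm only the LoRA matrices are trainable
layer = LoRALinear(nn.Linear(64, 64))
out = layer(torch.randn(2, 10, 64))                      # -> shape (2, 10, 64)
print([n for n, p in layer.named_parameters() if p.requires_grad])  # ['lora_a', 'lora_b']
```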

Maintenance & Community

  • Led by Datawhale members and advised by academic experts.
  • Open to contributions via Issues, feature suggestions, content improvements, and Pull Requests.
  • Community engagement encouraged through Datawhale's platform.

Licensing & Compatibility

  • Licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License (CC BY-NC-SA 4.0).
  • Non-commercial use is permitted with attribution and share-alike distribution; commercial use is prohibited.

Limitations & Caveats

The tutorial is a learning resource, not a production-ready framework. Chapter 6 on advanced training practices is marked as "in progress" (🚧). The licensing restricts commercial applications.

Health Check

  • Last Commit: 2 days ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 4
  • Issues (30d): 15

Star History

  • 2,261 stars in the last 30 days

Explore Similar Projects

Starred by Peter Norvig (Author of "Artificial Intelligence: A Modern Approach"; Research Director at Google), Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems"), and 2 more.

Hands-On-Large-Language-Models by HandsOnLLM

Code examples for "Hands-On Large Language Models" book

Created 1 year ago
Updated 1 month ago
16k stars

Top 1.4% on SourcePulse