cs336_note_and_hw by weiruihhh

LLM Development: Course Notes and Practical Assignments

Created 1 year ago

973 stars

Top 37.3% on SourcePulse

Project Summary

This repository compiles personal notes and completed homework from Stanford's CS336 course (Language Modeling from Scratch). It serves as a study resource for LLM fundamentals, from core architectures to training and optimization techniques.

How It Works

The project documents, step by step, how an LLM is built from scratch. It covers foundational concepts such as the Transformer architecture, tokenization, and embeddings, alongside practical implementation details. Key areas include GPU kernel optimization with Triton (e.g., Flash Attention), data-parallel training, scaling laws, data preprocessing, and reinforcement learning for LLMs (e.g., GRPO).
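To make the Flash Attention idea concrete, here is a minimal NumPy sketch of attention computed block by block with a running ("online") softmax. It is an illustration written for this summary, not code from the repository, and it runs on the CPU rather than as a Triton kernel. The function names and the `block_size` parameter are hypothetical.

```python
import numpy as np


def naive_attention(q, k, v):
    """Reference: softmax(q k^T / sqrt(d)) v with the full score matrix in memory."""
    d = q.shape[-1]
    scores = q @ k.T / np.sqrt(d)
    scores = scores - scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v


def tiled_attention(q, k, v, block_size=4):
    """Same result, visiting keys/values one block at a time.

    Only a (n_q, block_size) slice of the score matrix exists at any moment.
    A running row maximum and running normalizer are rescaled as blocks arrive.
    """
    n_q, d = q.shape
    out = np.zeros((n_q, v.shape[-1]))
    running_max = np.full((n_q, 1), -np.inf)
    running_sum = np.zeros((n_q, 1))

    for start in range(0, k.shape[0], block_size):
        k_blk = k[start:start + block_size]
        v_blk = v[start:start + block_size]
        scores = q @ k_blk.T / np.sqrt(d)

        new_max = np.maximum(running_max, scores.max(axis=-1, keepdims=True))
        # Rescale earlier partial results to the new maximum.
        # On the first block this factor is exp(-inf) = 0.
        correction = np.exp(running_max - new_max)
        p = np.exp(scores - new_max)

        running_sum = running_sum * correction + p.sum(axis=-1, keepdims=True)
        out = out * correction + p @ v_blk
        running_max = new_max

    return out / running_sum


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    q = rng.normal(size=(5, 8))
    k = rng.normal(size=(10, 8))
    v = rng.normal(size=(10, 6))
    print(np.allclose(naive_attention(q, k, v), tiled_attention(q, k, v, block_size=3)))
```

A real kernel would also tile the queries and keep each block in on-chip memory; the sketch only shows why the blockwise result matches the full softmax.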

Quick Start & Requirements

This repository serves as an educational resource, not a deployable application. Users can clone it to access the notes and homework code for learning. The README does not list technical prerequisites for running the homework code (e.g., Python, ML frameworks, CUDA); a standard ML development environment is the likely baseline.

Highlighted Details

Notes and code solutions for Stanford's CS336: Language Modeling from Scratch.
Completed homeworks cover Transformer architecture, parallel optimization with Triton (Flash Attention), and scaling laws.
Covers data cleaning pipelines and RL algorithms (GRPO), although the corresponding homeworks (HW4 and HW5) are marked as "to be updated."
Acts as a personal learning log for LLM construction.
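
For readers unfamiliar with GRPO, the step that gives the method its name can be sketched in a few lines: rewards for several sampled responses to the same prompt are normalized within that group, which replaces a learned value baseline. The snippet below is an illustrative sketch, not the repository's implementation. The function name and the `eps` default are assumptions, and the full objective (clipped policy ratio, optional KL penalty) is left out.

```python
import numpy as np


def group_relative_advantages(rewards, eps=1e-6):
    """Normalize rewards within each group of sampled responses.

    rewards: array-like of shape (num_prompts, group_size), one scalar reward
             per sampled response.
    eps:     small constant (assumed value) that avoids division by zero when
             every response in a group receives the same reward.
    """
    rewards = np.asarray(rewards, dtype=np.float64)
    mean = rewards.mean(axis=-1, keepdims=True)
    std = rewards.std(axis=-1, keepdims=True)
    return (rewards - mean) / (std + eps)


if __name__ == "__main__":
    # Two prompts, four sampled responses each (made-up rewards).
    rewards = [[1.0, 0.0, 0.0, 1.0],
               [0.2, 0.9, 0.4, 0.5]]
    print(group_relative_advantages(rewards))
```

Responses that score above their group's mean get a positive advantage and those below get a negative one; a group with identical rewards contributes no learning signal.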

Maintenance & Community

This is a personal project documenting a specific course. A QQ group (1039207477) is provided for discussion. No formal maintainers or dedicated community channels (like Slack/Discord) are indicated.

Licensing & Compatibility

Licensed under CC BY-NC-SA 4.0. The license permits non-commercial use, modification, and sharing, provided that attribution is given and derivative works are released under the same license. Commercial use is prohibited.

Limitations & Caveats

This repository is an educational resource, not a production library. HW4 (Data Cleaning) and HW5 (LLM + RL) are marked as "to be updated," indicating incomplete coverage. Content is specific to the CS336 curriculum and may not reflect the latest LLM development practices.

Health Check

Last Commit

2 months ago

Responsiveness

Inactive

Star History

71 stars in the last 30 days