stanford-cme-295-transformers-large-language-models by afshinea

Cheatsheet for Stanford's Transformers & LLMs course

created 4 months ago
2,267 stars

Top 20.4% on sourcepulse

Project Summary

This repository provides a comprehensive cheatsheet for Stanford's CME 295 course on Transformers and Large Language Models. It aims to consolidate key concepts for students and practitioners in NLP and deep learning, offering a structured overview of essential topics.

How It Works

The cheatsheet summarizes core concepts from the "Super Study Guide: Transformers & Large Language Models" book, which features extensive illustrations. It covers transformer architectures, attention mechanisms, optimization techniques, LLM fine-tuning methods, and applications like RAG and agents.
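
To give a taste of the material the guide condenses, here is a minimal single-head scaled dot-product attention sketch in NumPy. This is an illustrative reduction of the standard formula softmax(QKᵀ/√d_k)V, not code taken from the cheatsheet or the book:

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Single-head attention: softmax(Q K^T / sqrt(d_k)) V."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)  # similarity of each query to each key
    # Numerically stable softmax over the key axis
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V  # each output row is a weighted sum of value rows

# Toy example: 3 tokens, model dimension 4
rng = np.random.default_rng(0)
Q = K = V = rng.normal(size=(3, 4))
print(scaled_dot_product_attention(Q, K, V).shape)  # (3, 4)
```

The attention optimizations listed below (sparse, low-rank, FlashAttention) compute this same quantity with better time or memory behavior.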

Highlighted Details

  • Covers transformer variants and attention optimizations such as sparse attention, low-rank approximations, and FlashAttention.
  • Details LLM fine-tuning methods including SFT, LoRA, and preference tuning (see the sketch after this list).
  • Explains model-efficiency techniques such as Mixture of Experts, distillation, and quantization.
  • Includes applications like LLM-as-a-judge, RAG, agents, and reasoning models.
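
To make the fine-tuning bullet concrete, here is a minimal LoRA sketch in PyTorch. It is an illustrative reduction rather than code from the repository: a frozen pretrained linear layer is augmented with a trainable low-rank update B·A scaled by alpha/r.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Frozen linear layer plus a trainable low-rank update: W x + (alpha/r) * B A x."""
    def __init__(self, in_features, out_features, r=8, alpha=16):
        super().__init__()
        self.base = nn.Linear(in_features, out_features)
        self.base.weight.requires_grad_(False)  # pretrained weights stay frozen
        self.base.bias.requires_grad_(False)
        self.A = nn.Parameter(torch.randn(r, in_features) * 0.01)  # down-projection
        self.B = nn.Parameter(torch.zeros(out_features, r))        # up-projection, zero-init
        self.scale = alpha / r

    def forward(self, x):
        return self.base(x) + self.scale * (x @ self.A.T @ self.B.T)

layer = LoRALinear(768, 768)
trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
print(trainable)  # only the low-rank factors A and B are trained
```

Only A and B receive gradients, which is why LoRA cuts the trainable parameter count per adapted matrix from d² to 2rd (here 589,824 down to 12,288).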

Maintenance & Community

The project is authored by Afshine Amidi and Shervine Amidi, associated with Stanford University. Further details can be found on the course website: cme295.stanford.edu.

Licensing & Compatibility

The repository does not explicitly state a license, so reuse and redistribution terms are undefined.

Limitations & Caveats

This repository serves as a summary and reference guide, not a runnable codebase. It points to external resources for in-depth study.

Health Check

  • Last commit: 6 days ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 870 stars in the last 90 days

Explore Similar Projects

Starred by Stas Bekman (author of the Machine Learning Engineering Open Book; research engineer at Snowflake), Nathan Lambert (AI researcher at AI2), and 4 more.

large_language_model_training_playbook by huggingface — 478 stars
Tips for training large language models
Created 2 years ago; last updated 2 years ago