llm.mojo  by dorjeduck

Mojo port of Karpathy's llm.c for GPT-2 training

Created 1 year ago
358 stars

Top 78.0% on SourcePulse

Project Summary

This project ports Andrej Karpathy's llm.c to Mojo to demonstrate Mojo's performance and low-level capabilities for C-like applications. It targets developers interested in high-performance AI model implementation and systems programming. In the author's benchmark, the Mojo version runs slightly faster than the C version built with OpenMP.

How It Works

The project directly translates the C implementation of a GPT-2 training loop into Mojo. It relies on Mojo's Python-like syntax, static typing, and low-level memory management to match or exceed the performance of optimized C code. The project highlights two optimizations in particular: Mojo's vectorize function and its unroll_factor parameter.
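To make the vectorize idea concrete, here is a minimal pure-Python model of the pattern. It is a sketch under stated assumptions, not Mojo code and not taken from the project: the names vectorize, add_arrays, and kernel are hypothetical, and the assumption is that the kernel is called with a full SIMD width over the bulk of the data and with width 1 for any leftover elements. Loop unrolling (unroll_factor) is not modeled.

```python
from typing import Callable, List


def vectorize(func: Callable[[int, int], None], simd_width: int, size: int) -> None:
    """Call func(offset, width) over [0, size).

    Full chunks of `simd_width` elements are handled first; any remaining
    elements are handled one at a time (assumed tail behaviour).
    """
    full = size - size % simd_width
    for offset in range(0, full, simd_width):
        func(offset, simd_width)
    for offset in range(full, size):
        func(offset, 1)


def add_arrays(a: List[float], b: List[float], simd_width: int = 4) -> List[float]:
    """Elementwise a + b, written in the chunked style a SIMD kernel would use."""
    out = [0.0] * len(a)

    def kernel(offset: int, width: int) -> None:
        # On real SIMD hardware this would be one load/add/store of `width` lanes;
        # here it is an ordinary Python slice operation.
        lhs = a[offset:offset + width]
        rhs = b[offset:offset + width]
        out[offset:offset + width] = [x + y for x, y in zip(lhs, rhs)]

    vectorize(kernel, simd_width, len(a))
    return out


if __name__ == "__main__":
    print(add_arrays([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], [10.0] * 6))
```

With six elements and a width of 4, the kernel runs once on a chunk of four and then twice on single elements. That split between a wide main loop and a scalar tail is the part of the pattern this sketch is meant to show.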

Quick Start & Requirements

  • Install dependencies: pip install -r requirements.txt
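  • Run preparatory scripts: python prepro_tinyshakespeare.py (downloads and tokenizes the Tiny Shakespeare dataset), then python train_gpt2.py (saves the GPT-2 weights used as the starting checkpoint)
  • Requires Modular's magic CLI tool.
  • Run training: start magic shell, then run mojo train_gpt2.mojo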
  • Detailed usage: https://github.com/dorjeduck/llm.mojo/blob/main/usage.md

Highlighted Details

  • Benchmarks on an M2 MacBook Pro show train_gpt2.mojo with a loop time of 1819 ms. That is about 1.6% faster than train_gpt2.c with OpenMP (1849 ms) and roughly 4.1x faster than C without OpenMP (7473 ms).
  • Includes a ported test suite (test_gpt2.mojo) for validation.
  • Updated to track new Mojo language releases.
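The relative speeds in the benchmark bullet can be checked with a few lines of arithmetic. The sketch below uses only the three loop times quoted above; the function name speedup and the dictionary layout are illustrative choices, not part of the project.

```python
# Loop times in milliseconds, as quoted in the benchmark above (M2 MacBook Pro).
LOOP_TIMES_MS = {
    "train_gpt2.mojo": 1819,
    "train_gpt2.c (OpenMP)": 1849,
    "train_gpt2.c (no OpenMP)": 7473,
}


def speedup(baseline_ms: float, candidate_ms: float) -> float:
    """How many times faster the candidate is than the baseline."""
    return baseline_ms / candidate_ms


mojo_ms = LOOP_TIMES_MS["train_gpt2.mojo"]
for name, ms in LOOP_TIMES_MS.items():
    if name != "train_gpt2.mojo":
        # Prints roughly 1.02x for the OpenMP build and 4.11x for the plain C build.
        print(f"Mojo vs {name}: {speedup(ms, mojo_ms):.2f}x")
```

The gap to the OpenMP build is about 30 ms per loop, so the two are close, while the single-threaded C build is several times slower.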

Maintenance & Community

The project is primarily a proof of concept, with development focused on keeping pace with Mojo updates. The author is open to collaboration.

Licensing & Compatibility

  • License: MIT
  • The MIT license permits use in commercial and closed-source applications.

Limitations & Caveats

The project is in beta and is a proof of concept. No development is planned beyond keeping it compatible with new Mojo versions.

Health Check

  • Last commit: 3 months ago
  • Responsiveness: Inactive
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 2 stars in the last 30 days
