llm.mojo  by dorjeduck

Mojo port of Karpathy's llm.c for GPT-2 training

Created 1 year ago
361 stars

Top 78.1% on SourcePulse

Project Summary

This project ports Andrej Karpathy's llm.c to Mojo, aiming to demonstrate Mojo's performance and low-level capabilities for C-like applications. It's targeted at developers interested in high-performance AI model implementation and systems programming, offering a potential speed advantage over C with OpenMP.

How It Works

The project directly translates the C implementation of a GPT-2 model training loop into Mojo. It leverages Mojo's features, including its Python-like syntax, static typing, and low-level memory management capabilities, to achieve performance comparable to or exceeding optimized C code. The use of vectorize and unroll_factor optimizations is highlighted.
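The idea behind vectorizing a loop with an unroll factor can be sketched in plain Python. This is an illustration of the general technique only, not Mojo code and not taken from the repository: the function name chunked_apply and the parameters simd_width and unroll_factor are hypothetical names chosen for this sketch. The loop processes fixed-width chunks (standing in for SIMD lanes), several chunks per iteration, and finishes with a scalar tail for leftover elements.

```python
def chunked_apply(values, fn, simd_width=4, unroll_factor=2):
    """Apply fn to every element, mimicking a vectorized, unrolled loop.

    Hypothetical helper for illustration; real SIMD code would operate on
    hardware vectors rather than Python list slices.
    """
    n = len(values)
    out = [None] * n
    step = simd_width * unroll_factor
    i = 0

    # Main loop: each iteration handles unroll_factor chunks of simd_width elements.
    while i + step <= n:
        for u in range(unroll_factor):
            start = i + u * simd_width
            chunk = values[start:start + simd_width]
            out[start:start + simd_width] = [fn(x) for x in chunk]
        i += step

    # Any remaining full-width chunks that did not fill a whole unrolled step.
    while i + simd_width <= n:
        out[i:i + simd_width] = [fn(x) for x in values[i:i + simd_width]]
        i += simd_width

    # Scalar tail: leftover elements that do not fill a chunk.
    while i < n:
        out[i] = fn(values[i])
        i += 1

    return out


doubled = chunked_apply(list(range(11)), lambda x: x * 2)
```

With 11 elements, a width of 4, and an unroll factor of 2, the main loop covers the first 8 elements in one iteration and the scalar tail covers the last 3.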

Quick Start & Requirements

  • Install dependencies: pip install -r requirements.txt
  • Run preparatory scripts: python prepro_tinyshakespeare.py and python train_gpt2.py
  • Requires Modular's magic CLI tool.
  • Run training: magic shell then mojo train_gpt2.mojo
  • Detailed usage: https://github.com/dorjeduck/llm.mojo/blob/main/usage.md

Highlighted Details

  • Benchmarks on an M2 MacBook Pro show train_gpt2.mojo achieving a 1819ms loop time, about 30ms (under 2%) faster than train_gpt2.c with OpenMP (1849ms) and roughly 4 times faster than C without OpenMP (7473ms).
  • Includes a ported test suite (test_gpt2.mojo) for validation.
  • Actively updated to track Mojo language releases.
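The relative speedups implied by the benchmark figures above can be checked with a few lines of Python. The loop times are the ones reported in this summary; the dictionary layout and the speedup helper are assumptions made for this sketch.

```python
# Loop times in milliseconds, as reported in the benchmark above.
loop_ms = {
    "train_gpt2.mojo": 1819,
    "train_gpt2.c (OpenMP)": 1849,
    "train_gpt2.c (no OpenMP)": 7473,
}


def speedup(baseline_ms, candidate_ms):
    """Return how many times faster the candidate is than the baseline."""
    return baseline_ms / candidate_ms


mojo_ms = loop_ms["train_gpt2.mojo"]
for name, ms in loop_ms.items():
    # A ratio above 1.0 means the Mojo version finished the loop sooner.
    print(f"{name}: {ms} ms, Mojo is {speedup(ms, mojo_ms):.2f}x as fast")
```

The ratios come out to about 1.02x against C with OpenMP and about 4.11x against C without it, from a single reported run on one machine.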

Maintenance & Community

The project is primarily a proof of concept, with development focused on keeping pace with Mojo updates. The author is open to collaboration.

Licensing & Compatibility

  • License: MIT
  • Compatible with commercial and closed-source applications.

Limitations & Caveats

The project is currently in beta and serves as a proof of concept, with no further development planned beyond Mojo version compatibility.

Health Check

  • Last commit: 6 months ago
  • Responsiveness: Inactive
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 1 star in the last 30 days

Explore Similar Projects

Starred by Ross Taylor (Cofounder of General Reasoning; Cocreator of Papers with Code), Andreas Jansson (Cofounder of Replicate), and 1 more.

llama2.mojo by tairov

0.1%
2k
Mojo code for Llama 2 inference
Created 2 years ago
Updated 2 weeks ago
Starred by Tobi Lutke (Cofounder of Shopify), Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems"), and 6 more.

xTuring by stochasticai

0.1%
3k
SDK for fine-tuning and customizing open-source LLMs
Created 2 years ago
Updated 5 days ago