parameter-golf by openai

Extreme parameter-constrained LM training challenge

Created 1 month ago
4,195 stars

Top 11.5% on SourcePulse

View on GitHub
Project Summary

Summary

The OpenAI Parameter Golf challenge tasks participants with training the most effective language model within a strict 16MB artifact size and a 10-minute training window on 8x H100 GPUs. It targets researchers and engineers focused on extreme model compression and efficient training, fostering innovation in novel architectures and optimization techniques. Top performers gain a venue to showcase technical skill and may be considered for recruitment by OpenAI.

How It Works

This challenge frames model optimization as an L(N) problem: achieving the lowest validation loss with a fixed parameter count (N). Participants must innovate beyond standard practices, exploring techniques like aggressive parameter tying, depth recurrence, low-rank training, quantization-aware training (QAT), bitnets, and novel tokenizers. The core novelty lies in forcing extreme efficiency and creativity under strict size and compute budgets, prioritizing compression and speed alongside predictive performance.
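The 16MB artifact cap makes the achievable parameter count a direct function of weight precision, which is why quantization-aware training and bitnets are attractive here. A minimal sketch of the budget arithmetic; the ~64 KB code overhead is an illustrative assumption, not an official figure:

```python
# Rough parameter budgets under the 16 MB artifact cap.
# The code/tokenizer overhead below is an assumption for illustration.
ARTIFACT_BYTES = 16 * 1024 * 1024
CODE_OVERHEAD_BYTES = 64 * 1024

def max_params(bits_per_weight: float) -> int:
    """Largest parameter count that fits the remaining artifact budget."""
    usable_bits = (ARTIFACT_BYTES - CODE_OVERHEAD_BYTES) * 8
    return int(usable_bits // bits_per_weight)

for bits in (16, 8, 4, 1.58):  # fp16, int8, int4, BitNet-style ternary
    print(f"{bits:>5} bits/weight -> ~{max_params(bits) / 1e6:.1f}M params")
```

The jump from ~8M params at fp16 to ~80M+ at ternary precision illustrates why the challenge rewards aggressive compression schemes rather than plain small models.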

Quick Start & Requirements

Clone the repository and set up a Python environment. For local development on Apple Silicon, install MLX and related packages and download the FineWeb dataset; a basic MLX training run can then be started locally. For scaling, cloud GPU providers such as Runpod are recommended, using train_gpt.py with PyTorch. Final submissions must train in <10 min on 8x H100s, evaluate in <10 min, and fit code plus model into a 16MB artifact. Compute grants are available via a request form, and a Runpod template is referenced for cloud setup.
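The local workflow above might look roughly like the following; the repository URL and the dataset-download step are assumptions based on this summary, not verified commands:

```shell
# Hedged sketch of the local MLX workflow; URLs and script names are assumed.
git clone https://github.com/openai/parameter-golf && cd parameter-golf
python3 -m venv .venv && source .venv/bin/activate
pip install mlx            # Apple Silicon local development
# download the FineWeb data the repo expects, then start a local run;
# for scaled 8x H100 runs (e.g. on Runpod), use the PyTorch path:
python train_gpt.py
```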

Highlighted Details

  • Objective: Lowest validation loss (bits per byte on FineWeb) within a 16MB artifact and 10-minute training limit on 8x H100s.
  • Evaluation Metric: Tokenizer-agnostic compression performance on FineWeb validation set.
  • Compute Grant: OpenAI provides $1,000,000 in compute credits.
  • Recruitment Opportunity: Exceptional performance may lead to early-career researcher opportunities at OpenAI.
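Because the metric is tokenizer-agnostic, the loss must be normalized against raw text bytes rather than tokens. A hedged sketch of that conversion, assuming cross-entropy is reported in nats and the validation text is measured in UTF-8 bytes (the exact evaluation harness is not specified here):

```python
import math

def bits_per_byte(total_nll_nats: float, total_bytes: int) -> float:
    """Convert summed cross-entropy (nats) over a text span into
    bits per byte of the underlying UTF-8 text."""
    return total_nll_nats / (math.log(2) * total_bytes)

# Illustrative numbers: 2.77 nats/token average loss over 1000 tokens
# that cover 4000 bytes of text.
bpb = bits_per_byte(2.77 * 1000, 4000)
```

Normalizing by bytes means a model with a large-vocabulary tokenizer (fewer tokens, higher loss per token) is scored on the same footing as one with a byte-level tokenizer.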

Maintenance & Community

Adapts code from modded-nanogpt. Community discussions on OpenAI Discord (#parameter-golf-discussions, #parameter-golf-announcements). Challenge runs March 18th - April 30th. Links to the leaderboard, participant form, and compute grant request form are mentioned but not provided.

Licensing & Compatibility

The specific license is not detailed in the provided README text. Compatibility is primarily defined by the strict hardware and time constraints for submission.

Limitations & Caveats

Submissions are strictly time-limited for training/evaluation. Top results require manual verification; non-reproducible submissions risk disqualification. External compute beyond standard tuning is prohibited. The 16MB artifact must be self-contained, disallowing external data access or network calls during evaluation.

Health Check

  • Last Commit: 21 hours ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 775
  • Issues (30d): 31
  • Star History: 4,290 stars in the last 30 days

Explore Similar Projects

Starred by Benjamin Bolte (Cofounder of K-Scale Labs), Albert Gu (Cofounder of Cartesia; Professor at CMU), and 2 more.

Muon by KellerJordan

Top 1.2% · 2k stars
Optimizer for neural network hidden layers
Created 1 year ago · Updated 2 months ago