AdderBoard by anadim

Minimal transformer for multi-digit addition

Created 1 month ago
345 stars

Top 80.4% on SourcePulse

View on GitHub
Project Summary

This repository presents the AdderBoard challenge: to engineer the smallest possible autoregressive transformer capable of adding two 10-digit numbers with over 99% accuracy. It targets researchers and engineers interested in extreme model compression, efficient transformer architectures, and exploring the fundamental capabilities of transformers beyond natural language processing. The primary benefit is advancing the state-of-the-art in minimal, performant neural network designs for specific computational tasks.

How It Works

The project frames integer addition as a sequence-to-sequence problem for transformers. Models must autonomously learn digit alignment, per-digit arithmetic, and carry propagation through self-attention, MLPs, and autoregressive generation, without hardcoded logic in their forward pass. This necessitates innovative approaches to tokenization, data formatting, and architectural design to minimize parameter counts. Submissions are categorized into "Trained" (weights learned via generic algorithms) and "Hand-coded" (weights analytically determined for constructive proof).
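The sequence framing described above can be sketched as follows. This is an illustrative assumption, not the repository's actual scheme: the zero-padding, the `a+b=` prompt layout, and the digit-reversal trick (emitting the least significant digit first so each output digit depends only on carries already resolved) are all hypothetical choices for the sketch.

```python
def make_example(a: int, b: int, n_digits: int = 10) -> tuple[str, str]:
    """Format an addition problem as (prompt, target) token strings.

    Digits are reversed (least significant first), a common trick that
    lets an autoregressive model emit each sum digit as soon as the
    corresponding carry is known.
    """
    fmt = f"0{n_digits}d"
    a_str = format(a, fmt)[::-1]                        # reversed, zero-padded operand
    b_str = format(b, fmt)[::-1]
    sum_str = format(a + b, f"0{n_digits + 1}d")[::-1]  # sum may carry one extra digit
    prompt = f"{a_str}+{b_str}="
    return prompt, sum_str

prompt, target = make_example(1234567890, 9876543210)
# prompt → "0987654321+0123456789=", target → "00111111111"
```

The model would be trained to continue `prompt` with `target` one digit at a time; the actual submissions may use very different tokenizations.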

Quick Start & Requirements

  • Primary Install/Run: No direct installation command provided. Submissions are typically verified using a provided verify.py script.
  • Prerequisites: Python.
  • Dependencies: Standard machine learning libraries are implied.
  • Links: verify.py script for testing submissions.

Highlighted Details

  • Achieves >= 99% accuracy on a held-out 10K test set of 10-digit number additions.
  • Leaderboards showcase "Hand-coded" models reaching as few as 12 parameters and "Trained" models around 67 parameters.
  • Key techniques include rank-1/low-rank projections, factorized embeddings, custom positional encodings (e.g., RoPE, ALiBi), weight tying, and curriculum learning.
  • Notable findings include a "parameter cliff" around 800 parameters for trained models and convergence on specific layer dimensions (d=4, d=7).
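To illustrate why rank-1/low-rank projections are effective for shrinking parameter counts (a sketch under assumed dimensions, not code from the repository): factorizing a dense `d_in × d_out` weight matrix as `U @ V` with rank `r` reduces its parameters from `d_in * d_out` to `r * (d_in + d_out)`.

```python
import numpy as np

d_in, d_out, r = 7, 7, 1                     # d=7 appears on the leaderboard; r=1 is rank-1
dense_params = d_in * d_out                  # 49 parameters for a full matrix
low_rank_params = r * (d_in + d_out)         # 14 parameters for the rank-1 factorization

rng = np.random.default_rng(0)
U = rng.standard_normal((d_in, r))
V = rng.standard_normal((r, d_out))
W = U @ V                                    # reconstructed weight matrix, rank 1 by construction
assert np.linalg.matrix_rank(W) == 1
```

The same counting argument applies to factorized embeddings, where a `vocab × d` table is split into `vocab × r` and `r × d` factors.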

Maintenance & Community

  • Maintained by Dimitris Papailiopoulos (@dimitrispapail).
  • No explicit community channels (e.g., Discord, Slack) are listed.

Licensing & Compatibility

  • License: MIT.
  • Compatibility: The MIT license permits commercial use and integration into closed-source projects.

Limitations & Caveats

The challenge is strictly defined for 10-digit integer addition. Models must adhere to a pure autoregressive transformer definition, forbidding task-specific control flow within the model's forward pass. Some leaderboard entries are marked as preliminary, requiring independent verification on the 10K test suite.

Health Check

  • Last commit: 2 weeks ago
  • Responsiveness: Inactive
  • Pull requests (30d): 6
  • Issues (30d): 7
  • Star history: 42 stars in the last 30 days

Starred by Jeremy Howard (Cofounder of fast.ai) and Stas Bekman (Author of "Machine Learning Engineering Open Book"; Research Engineer at Snowflake).

Explore Similar Projects

SwissArmyTransformer by THUDM

Transformer library for flexible model development

1k stars
Created 4 years ago
Updated 1 year ago

Starred by Benjamin Bolte (Cofounder of K-Scale Labs), Albert Gu (Cofounder of Cartesia; Professor at CMU), and 2 more.

Muon by KellerJordan

Optimizer for neural network hidden layers

2k stars
Created 1 year ago
Updated 2 months ago

Starred by Tobi Lutke (Cofounder of Shopify), Andrej Karpathy (Founder of Eureka Labs; formerly at Tesla, OpenAI; author of CS 231n), and 5 more.

matmulfreellm by ridgerchu

MatMul-free language models

3k stars
Created 1 year ago
Updated 4 months ago

Starred by Shizhe Diao (Author of LMFlow; Research Scientist at NVIDIA), Yineng Zhang (Inference Lead at SGLang; Research Scientist at Together AI), and 8 more.

EAGLE by SafeAILab

Speculative decoding research for faster LLM inference

2k stars
Created 2 years ago
Updated 1 month ago