cifar10-airbench by KellerJordan

Fast CIFAR-10 training benchmarks

created 1 year ago
274 stars

Top 95.2% on sourcepulse

View on GitHub
Project Summary

This repository provides highly optimized PyTorch scripts for training neural networks on the CIFAR-10 dataset, achieving state-of-the-art speed benchmarks. It targets researchers and practitioners seeking to establish fast, reproducible baselines for image classification tasks, offering significant speedups over standard training methods.

How It Works

The project leverages custom optimizations, including the Muon optimizer and data filtering techniques, to drastically reduce training time. These methods are designed to accelerate convergence without sacrificing accuracy, making them suitable for rapid experimentation and baseline establishment. The core advantage lies in the aggressive optimization of the training loop and data loading pipeline for maximum GPU utilization.
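The Muon optimizer's central idea is to replace the raw momentum update for each weight matrix with an approximately orthogonalized version of it. The repository's exact implementation is not reproduced here; as a rough illustration, the classical cubic Newton-Schulz iteration below drives a matrix toward a nearby (semi-)orthogonal matrix (Muon itself uses a tuned quintic variant to get away with far fewer iterations), sketched with NumPy for self-containment:

```python
import numpy as np

def orthogonalize(G, steps=30):
    """Approximate the nearest (semi-)orthogonal matrix to G using the
    classical cubic Newton-Schulz iteration X <- 1.5*X - 0.5*(X X^T) X.
    Dividing by the Frobenius norm first guarantees all singular values
    start in (0, 1], where the iteration converges monotonically to 1."""
    X = G / np.linalg.norm(G)           # Frobenius normalization
    transposed = X.shape[0] > X.shape[1]
    if transposed:                      # iterate on the short-and-wide orientation
        X = X.T
    for _ in range(steps):
        A = X @ X.T
        X = 1.5 * X - 0.5 * (A @ X)
    return X.T if transposed else X

# A Muon-style update would then look roughly like:
#   momentum = beta * momentum + grad
#   weight  -= lr * orthogonalize(momentum)
```

In Muon this orthogonalization is applied per weight matrix to the momentum buffer, which empirically lets the optimizer take larger, better-conditioned steps than raw SGD momentum.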

Quick Start & Requirements

  • Install & run: git clone https://github.com/KellerJordan/cifar10-airbench.git && cd cifar10-airbench && python airbench94_muon.py
  • Requirements: PyTorch (torch), Torchvision (torchvision).
  • Hardware: NVIDIA A100 GPU recommended for achieving stated benchmarks.
  • Documentation: Official Quickstart

Highlighted Details

  • Achieves 94% accuracy on CIFAR-10 in 2.6 seconds and 96% in 27 seconds on an NVIDIA A100.
  • Roughly 15x faster than standard ResNet-18 training, which takes about 7 minutes to reach 96%.
  • Includes a GPU-accelerated dataloader for custom experiments.
  • Demonstrates data-selection strategies for improved training signal.
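The repository's GPU-resident dataloader is not reproduced here, but the general pattern behind it is simple: hold the entire (small) dataset in device memory as one tensor and yield shuffled slices, avoiding per-batch host-to-device copies and Python-side collation. A minimal, hypothetical sketch of that pattern, with NumPy arrays standing in for GPU tensors:

```python
import numpy as np

class InMemoryLoader:
    """Yield shuffled (images, labels) batches from arrays held in memory.
    On a GPU these would be tensors already resident on the device, so each
    batch is just an index-select with no host<->device transfer."""
    def __init__(self, images, labels, batch_size, seed=0):
        assert len(images) == len(labels)
        self.images, self.labels = images, labels
        self.batch_size = batch_size
        self.rng = np.random.default_rng(seed)

    def __iter__(self):
        order = self.rng.permutation(len(self.images))  # fresh shuffle per epoch
        for start in range(0, len(order), self.batch_size):
            idx = order[start:start + self.batch_size]
            yield self.images[idx], self.labels[idx]

    def __len__(self):
        return -(-len(self.images) // self.batch_size)  # ceiling division
```

For CIFAR-10 the full training set (50,000 32x32 RGB images, about 600 MB in float32) fits comfortably in GPU memory, which is what makes this approach practical.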

Maintenance & Community

The project appears to be a personal research effort by Keller Jordan. No specific community channels or roadmap are indicated in the README.

Licensing & Compatibility

The repository does not explicitly state a license. This is a significant omission for evaluating commercial use or integration into closed-source projects.

Limitations & Caveats

The primary limitation is the missing license, which leaves usage rights unclear. The benchmarks are specific to NVIDIA A100 hardware, and the stated speeds may not be reproducible on other GPUs. The project is a set of optimized scripts rather than a comprehensive library.

Health Check

  • Last commit: 2 weeks ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 1
  • Issues (30d): 0
  • Star history: 44 stars in the last 90 days

Explore Similar Projects

Starred by George Hotz (Author of tinygrad; Founder of the tiny corp, comma.ai), Andrej Karpathy (Founder of Eureka Labs; formerly at Tesla, OpenAI; author of CS 231n), and 3 more.

modded-nanogpt by KellerJordan
2.6% · 3k stars
Language model training speedrun on 8x H100 GPUs
created 1 year ago · updated 2 weeks ago
Starred by Aravind Srinivas (Cofounder of Perplexity), Stas Bekman (Author of Machine Learning Engineering Open Book; Research Engineer at Snowflake), and 12 more.

DeepSpeed by deepspeedai
0.2% · 40k stars
Deep learning optimization library for distributed training and inference
created 5 years ago · updated 1 day ago