grok by openai

Code for Grokking research paper

Created 4 years ago

4,209 stars

Top 11.6% on SourcePulse

2 Experts Love This Project

victortaelin: Author of Bend, Kind, HVM

jinze1994: Research Scientist at Alibaba Qwen

Project Summary

This repository contains the code and experimental setup for the paper "Grokking: Generalization Beyond Overfitting on Small Algorithmic Datasets". Researchers and practitioners can use it to reproduce and extend experiments on grokking, in which a model's validation accuracy climbs from near chance to near perfect long after the model has already overfit its training data.
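
One way to make "generalizing after overfitting" measurable is to compare the training step at which training accuracy first crosses a threshold with the step at which validation accuracy does. The sketch below is only an illustration under stated assumptions: the function names, the 0.99 threshold, and the log format (parallel lists of steps and accuracies) are hypothetical and are not taken from the repository.

```python
from typing import Optional, Sequence


def first_step_reaching(
    steps: Sequence[int], accuracies: Sequence[float], threshold: float
) -> Optional[int]:
    """Return the first logged step whose accuracy is >= threshold, or None."""
    for step, accuracy in zip(steps, accuracies):
        if accuracy >= threshold:
            return step
    return None


def generalization_delay(
    steps: Sequence[int],
    train_acc: Sequence[float],
    val_acc: Sequence[float],
    threshold: float = 0.99,  # assumed cutoff for "fits the data"
) -> Optional[int]:
    """Steps between fitting the training set and fitting the validation set.

    Returns None if either curve never reaches the threshold.
    """
    fit_step = first_step_reaching(steps, train_acc, threshold)
    generalize_step = first_step_reaching(steps, val_acc, threshold)
    if fit_step is None or generalize_step is None:
        return None
    return generalize_step - fit_step
```

With invented logs in which training accuracy reaches 0.99 at step 1,000 and validation accuracy reaches it at step 100,000, `generalization_delay` returns 99,000. A large positive delay is the signature the paper calls grokking.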

How It Works

The project trains models on small algorithmic datasets and tracks how well they generalize over very long training runs. In the paper, these datasets are tables of binary operations such as modular arithmetic: a fraction of the equations is used for training and the rest is held out for validation. The README does not name the framework or the model architecture. The main contribution is the experimental setup, which is designed to isolate the delayed generalization that the paper calls grokking.
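
One kind of small algorithmic dataset is the full table of a binary operation, for example addition modulo a prime, split at random into training and validation equations. The sketch below shows that idea using only the standard library. It is not the repository's data pipeline: the names `make_modular_addition_table` and `split_table`, the choice of addition, and the seed handling are assumptions made for illustration.

```python
import random
from typing import List, Sequence, Tuple

Equation = Tuple[int, int, int]  # (a, b, c) meaning a + b = c (mod p)


def make_modular_addition_table(p: int) -> List[Equation]:
    """Enumerate every equation a + b = c (mod p) for 0 <= a, b < p."""
    return [(a, b, (a + b) % p) for a in range(p) for b in range(p)]


def split_table(
    equations: Sequence[Equation], train_fraction: float, seed: int = 0
) -> Tuple[List[Equation], List[Equation]]:
    """Shuffle the table and split it into disjoint train/validation parts."""
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must be strictly between 0 and 1")
    shuffled = list(equations)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is repeatable
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]
```

For example, `split_table(make_modular_addition_table(97), 0.5)` yields 4,704 training and 4,705 validation equations out of 9,409. Because the whole table is enumerated, the validation set measures whether the model has learned the operation itself and not just memorized the training rows.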

Quick Start & Requirements

Primary install / run command:
```
pip install -e .
./scripts/train.py
```
Prerequisites: Python and pip. The README does not name the deep learning framework or list dependencies; `pip install -e .` installs whatever the package's own setup configuration declares.

Highlighted Details

Codebase for the "Grokking: Generalization Beyond Overfitting on Small Algorithmic Datasets" paper.
Focuses on studying the grokking phenomenon in deep learning models.

Maintenance & Community

Developed by OpenAI researchers. No community links or roadmap are provided in the README.

Licensing & Compatibility

The license is not specified in the README.

Limitations & Caveats

The README is extremely minimal, lacking details on specific dependencies, hardware requirements (e.g., GPU, CUDA), dataset availability, or the exact model architectures used. The scope appears limited to reproducing the paper's experiments.

Health Check

Last Commit: 1 year ago

Responsiveness: Inactive

Pull Requests (30d): 0

Issues (30d): 2

Star History

16 stars in the last 30 days
