Research paper implementation for loss-conditional diffusion models of neural network parameters
This repository provides the official PyTorch implementation for "Learning to Learn with Generative Models of Neural Network Checkpoints." It enables users to train and evaluate diffusion models that generate updated neural network parameters, allowing for one-step optimization from random initialization. The target audience includes researchers and practitioners interested in meta-learning and efficient neural network optimization.
How It Works
The core of G.pt is a transformer that operates on sequences of neural network parameters and is trained as a diffusion model directly in parameter space. Like Vision Transformers, it uses minimal domain-specific inductive biases. The model is conditioned on a starting parameter vector, that vector's initial loss, and a prompted target loss, and it samples updated parameters predicted to achieve the target. This enables rapid optimization by directly generating a "good" checkpoint instead of taking many gradient steps.
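The conditioning interface can be illustrated with a minimal, self-contained toy sketch. This is not the repo's code: the sizes, module names, and the x0-parameterized DDPM sampler below are all assumptions made for illustration. A denoiser receives the noisy "future" parameters plus the starting parameters and the two loss prompts, and ancestral sampling produces an updated parameter vector in a single generative pass.

```python
# Toy sketch (assumed, not the repo's implementation) of loss-conditional
# diffusion in parameter space: a denoiser predicts the clean updated
# parameter vector from a noisy one, conditioned on the starting parameters,
# their loss, and a prompted target loss.
import torch
import torch.nn as nn

P, T = 64, 100                          # flattened parameter size, diffusion steps
betas = torch.linspace(1e-4, 0.02, T)   # standard linear noise schedule
alphas = 1.0 - betas
alpha_bar = torch.cumprod(alphas, dim=0)

class ParamDenoiser(nn.Module):
    """Predicts x0 (the updated parameters) from a noisy sample plus conditioning."""
    def __init__(self):
        super().__init__()
        # inputs: noisy params + starting params + [timestep, start loss, target loss]
        self.net = nn.Sequential(
            nn.Linear(2 * P + 3, 256), nn.GELU(), nn.Linear(256, P)
        )

    def forward(self, x_t, t, theta_0, loss_0, loss_target):
        cond = torch.stack([t.float() / T, loss_0, loss_target]).unsqueeze(0)
        return self.net(torch.cat([x_t, theta_0, cond], dim=-1))

@torch.no_grad()
def sample_updated_params(model, theta_0, loss_0, loss_target):
    """DDPM ancestral sampling, x0-parameterized: one generative pass yields
    an updated parameter vector aimed at the prompted target loss."""
    x = torch.randn(1, P)
    for t in reversed(range(T)):
        x0_hat = model(x, torch.tensor(t), theta_0, loss_0, loss_target)
        # posterior mean of q(x_{t-1} | x_t, x0_hat)
        ab_t = alpha_bar[t]
        ab_prev = alpha_bar[t - 1] if t > 0 else torch.tensor(1.0)
        coef_x0 = betas[t] * ab_prev.sqrt() / (1 - ab_t)
        coef_xt = (1 - ab_prev) * alphas[t].sqrt() / (1 - ab_t)
        mean = coef_x0 * x0_hat + coef_xt * x
        noise = torch.randn_like(x) if t > 0 else torch.zeros_like(x)
        var = betas[t] * (1 - ab_prev) / (1 - ab_t)
        x = mean + var.sqrt() * noise
    return x.squeeze(0)

theta_0 = torch.randn(1, P)  # randomly-initialized network, flattened
theta_1 = sample_updated_params(
    ParamDenoiser(), theta_0,
    loss_0=torch.tensor(2.3),      # loss of the starting checkpoint
    loss_target=torch.tensor(0.1), # prompted loss we want the sample to achieve
)
```

In the actual model the denoiser is a transformer over parameter tokens rather than an MLP, but the conditioning signals (starting parameters, initial loss, target loss) play the same role as in this sketch.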
Quick Start & Requirements
Install the package:

pip install -e .

Alternatively, use the provided environment.yml to create a Conda environment. Then download pretrained models and data:

python Gpt/download.py
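After the download completes, the fetched checkpoints can be inspected like ordinary PyTorch files. This is a hedged sanity check, not the repo's documented workflow, and the file path below is hypothetical.

```python
# Hedged sketch: inspect a downloaded checkpoint as a standard PyTorch file.
# The path is hypothetical; substitute whatever Gpt/download.py actually fetched.
import torch

ckpt = torch.load("checkpoints/example_g.pt", map_location="cpu")
print(type(ckpt))
if isinstance(ckpt, dict):
    print(list(ckpt.keys()))  # e.g., model weights, optimizer state, metadata
```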
Highlighted Details
Maintenance & Community
The project is associated with the University of California, Berkeley. The codebase borrows from OpenAI's diffusion repos and Andrej Karpathy's minGPT.
Licensing & Compatibility
The repository does not explicitly state a license in the README. Compatibility for commercial use or closed-source linking is not specified.
Limitations & Caveats
The README notes potential compatibility issues with Python versions newer than 3.8 for the IsaacGym RL simulator. The project appears to be research-oriented, and stability for production use is not guaranteed.