Research paper implementation for loss-conditional diffusion models of neural network parameters
This repository provides the official PyTorch implementation for "Learning to Learn with Generative Models of Neural Network Checkpoints." It enables users to train and evaluate diffusion models that generate updated neural network parameters, allowing for one-step optimization from random initialization. The target audience includes researchers and practitioners interested in meta-learning and efficient neural network optimization.
How It Works
The core of G.pt is a transformer that operates on sequences of neural network parameters and is trained as a diffusion model directly in parameter space. Like Vision Transformers, it uses minimal domain-specific inductive biases. The model is conditioned on a starting parameter vector, that vector's initial loss, and a prompted target loss, and it samples updated parameters predicted to achieve the target. This enables rapid optimization by directly generating a "good" checkpoint instead of taking many gradient steps.
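The conditioning interface can be illustrated with a minimal, self-contained toy sketch. This is not the repo's code: the sizes, module names, and the x0-parameterized DDPM sampler below are all assumptions made for illustration. A denoiser receives the noisy "future" parameters plus the starting parameters and the two loss prompts, and ancestral sampling produces an updated parameter vector in a single generative pass.

```python
# Toy sketch (assumed, not the repo's implementation) of loss-conditional
# diffusion in parameter space: a denoiser predicts the clean updated
# parameter vector from a noisy one, conditioned on the starting parameters,
# their loss, and a prompted target loss.
import torch
import torch.nn as nn

P, T = 64, 100                          # flattened parameter size, diffusion steps
betas = torch.linspace(1e-4, 0.02, T)   # standard linear noise schedule
alphas = 1.0 - betas
alpha_bar = torch.cumprod(alphas, dim=0)

class ParamDenoiser(nn.Module):
    """Predicts x0 (the updated parameters) from a noisy sample plus conditioning."""
    def __init__(self):
        super().__init__()
        # inputs: noisy params + starting params + [timestep, start loss, target loss]
        self.net = nn.Sequential(
            nn.Linear(2 * P + 3, 256), nn.GELU(), nn.Linear(256, P)
        )

    def forward(self, x_t, t, theta_0, loss_0, loss_target):
        cond = torch.stack([t.float() / T, loss_0, loss_target]).unsqueeze(0)
        return self.net(torch.cat([x_t, theta_0, cond], dim=-1))

@torch.no_grad()
def sample_updated_params(model, theta_0, loss_0, loss_target):
    """DDPM ancestral sampling, x0-parameterized: one generative pass yields
    an updated parameter vector aimed at the prompted target loss."""
    x = torch.randn(1, P)
    for t in reversed(range(T)):
        x0_hat = model(x, torch.tensor(t), theta_0, loss_0, loss_target)
        # posterior mean of q(x_{t-1} | x_t, x0_hat)
        ab_t = alpha_bar[t]
        ab_prev = alpha_bar[t - 1] if t > 0 else torch.tensor(1.0)
        coef_x0 = betas[t] * ab_prev.sqrt() / (1 - ab_t)
        coef_xt = (1 - ab_prev) * alphas[t].sqrt() / (1 - ab_t)
        mean = coef_x0 * x0_hat + coef_xt * x
        noise = torch.randn_like(x) if t > 0 else torch.zeros_like(x)
        var = betas[t] * (1 - ab_prev) / (1 - ab_t)
        x = mean + var.sqrt() * noise
    return x.squeeze(0)

theta_0 = torch.randn(1, P)  # randomly-initialized network, flattened
theta_1 = sample_updated_params(
    ParamDenoiser(), theta_0,
    loss_0=torch.tensor(2.3),      # loss of the starting checkpoint
    loss_target=torch.tensor(0.1), # prompted loss we want the sample to achieve
)
```

In the actual model the denoiser is a transformer over parameter tokens rather than an MLP, but the conditioning signals (starting parameters, initial loss, target loss) play the same role as in this sketch.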
Quick Start & Requirements
Install the package:

pip install -e .

Alternatively, use the provided environment.yml to create a Conda environment. Then download pretrained models and data:

python Gpt/download.py
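After the download completes, the fetched checkpoints can be inspected like ordinary PyTorch files. This is a hedged sanity check, not the repo's documented workflow, and the file path below is hypothetical.

```python
# Hedged sketch: inspect a downloaded checkpoint as a standard PyTorch file.
# The path is hypothetical; substitute whatever Gpt/download.py actually fetched.
import torch

ckpt = torch.load("checkpoints/example_g.pt", map_location="cpu")
print(type(ckpt))
if isinstance(ckpt, dict):
    print(list(ckpt.keys()))  # e.g., model weights, optimizer state, metadata
```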
Highlighted Details
Maintenance & Community
The project is associated with the University of California, Berkeley. The codebase borrows from OpenAI's diffusion repos and Andrej Karpathy's minGPT.
Licensing & Compatibility
The repository does not explicitly state a license in the README. Compatibility for commercial use or closed-source linking is not specified.
Limitations & Caveats
The README notes potential compatibility issues with Python versions newer than 3.8 for the IsaacGym RL simulator. The project appears to be research-oriented, and stability for production use is not guaranteed.