WeightWatcher by CalculatedContent

DNN analyzer for predicting model accuracy without training/test data

created 6 years ago
1,630 stars

Top 26.3% on sourcepulse

Project Summary

WeightWatcher is a diagnostic tool for analyzing Deep Neural Networks (DNNs) without requiring access to training or test data. It helps users predict model accuracy, identify over-training or over-parameterization, and detect potential issues during compression or fine-tuning, targeting researchers and practitioners in deep learning.

How It Works

The tool is based on theoretical research into Heavy-Tailed Self-Regularization (HT-SR) and employs concepts from Random Matrix Theory (RMT) and Statistical Mechanics. It analyzes the Empirical Spectral Density (ESD) of each layer's weight matrix, fitting the tail of the distribution to a power law to derive generalization metrics. This approach quantifies how "non-random" or "heavy-tailed" a layer's weight distribution is, correlating these properties with model generalization performance.
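
The pipeline described above can be sketched in plain NumPy: compute the ESD as the eigenvalues of the correlation matrix X = W^T W / N, then estimate the tail exponent alpha. The Hill estimator below is a simple illustrative stand-in, not WeightWatcher's actual fitting procedure (which performs a proper power-law fit over a chosen xmin).

```python
# Sketch of the ESD + tail-exponent idea, assuming nothing beyond NumPy.
import numpy as np

def esd(W):
    """Empirical Spectral Density: eigenvalues of X = W^T W / N."""
    N = W.shape[0]
    X = W.T @ W / N
    return np.linalg.eigvalsh(X)  # ascending, all >= 0 up to round-off

def hill_alpha(eigs, k=50):
    """Hill estimator of the power-law tail exponent from the k largest
    eigenvalues (a crude substitute for a full power-law fit)."""
    tail = np.sort(eigs)[-k:]
    lam_min = tail[0]
    return 1.0 + k / np.sum(np.log(tail / lam_min))

rng = np.random.default_rng(0)
W = rng.standard_normal((1000, 500))  # stand-in for a Dense layer's weights
eigs = esd(W)
alpha = hill_alpha(eigs)
```

For a purely random Gaussian matrix the ESD follows the Marchenko-Pastur law rather than a power law, so the estimated alpha is large; well-trained layers are expected to show genuinely heavy tails with smaller alpha.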

Quick Start & Requirements

  • Install via pip: pip install weightwatcher
  • Requires Python 3.7+ and either PyTorch 1.x or TensorFlow 2.x/Keras.
  • Supported layer types include Dense/Linear (fully connected), Conv1D, and Conv2D.
  • Official blog and example notebooks are available for deeper dives.
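
A minimal usage sketch following the analyze/summary workflow documented in the project README; the optional `torch` and `weightwatcher` imports are guarded so the snippet degrades gracefully when they are not installed.

```python
# Hedged sketch of the documented WeightWatcher workflow; the model
# architecture here is an arbitrary example, not from the project docs.
try:
    import torch.nn as nn
    import weightwatcher as ww
except ImportError:          # packages are optional in this sketch
    nn = ww = None

def analyze(model):
    """Run the analyze/summary pipeline on a (PyTorch) model."""
    watcher = ww.WeightWatcher(model=model)
    details = watcher.analyze()          # per-layer metrics (alpha, etc.)
    return watcher.get_summary(details)  # aggregate quality metrics

if ww is not None and nn is not None:
    model = nn.Sequential(nn.Linear(64, 32), nn.ReLU(), nn.Linear(32, 10))
    summary = analyze(model)
```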

Highlighted Details

  • Predicts test accuracies and identifies over/under-training without data.
  • Analyzes PEFT/LoRA models by examining delta layers.
  • Detects "Correlation Traps" indicative of poor training via randomized ESD comparison.
  • Offers metrics like alpha (power law exponent) and rand_distance for generalization assessment.
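
The randomization check behind the "Correlation Trap" and rand_distance ideas can be sketched as follows: permute the weights element-wise, which destroys learned correlations while preserving the marginal distribution, then compare the two ESDs. The total-variation distance used here is an illustrative stand-in, not WeightWatcher's exact metric.

```python
# Crude stand-in for a rand_distance-style randomized-ESD comparison.
import numpy as np

def esd(W):
    """Eigenvalues of the layer correlation matrix X = W^T W / N."""
    N = W.shape[0]
    return np.linalg.eigvalsh(W.T @ W / N)

def rand_distance(W, rng, n_bins=20):
    """Total-variation distance between the ESD of W and the ESD of the
    same weights element-wise permuted (correlations destroyed)."""
    W_rand = rng.permutation(W.ravel()).reshape(W.shape)
    e1, e2 = esd(W), esd(W_rand)
    lo, hi = min(e1.min(), e2.min()), max(e1.max(), e2.max())
    h1 = np.histogram(e1, bins=n_bins, range=(lo, hi))[0] / e1.size
    h2 = np.histogram(e2, bins=n_bins, range=(lo, hi))[0] / e2.size
    return 0.5 * np.abs(h1 - h2).sum()

rng = np.random.default_rng(0)
W = rng.standard_normal((400, 200))  # purely random weights
d = rand_distance(W, rng)            # small: no learned structure to destroy
```

A layer whose ESD barely moves under randomization looks random (little learned correlation); a large shift indicates strong correlations, and isolated outlier eigenvalues in the randomized ESD are the signature of a correlation trap.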

Maintenance & Community

  • Developed by Charles H. Martin (Calculation Consulting) and contributors.
  • Active Discord server for community support and discussion.
  • Numerous academic papers and presentations detail the underlying research.

Licensing & Compatibility

  • Licensed under the Apache License 2.0.
  • Compatible with commercial use and closed-source linking.

Limitations & Caveats

The power law fits are most effective for well-trained, heavy-tailed ESDs; results may be spurious if alpha > 8.0 or if ESDs are multimodal or not well-described by a single power law. The PEFT/LoRA analysis is experimental and currently ignores biases.
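
The caveat above can be made concrete with a small helper. The alpha > 8 cutoff comes from the text; the other bands reflect the HT-SR rule of thumb that alpha between roughly 2 and 6 marks a well-fit heavy tail, and are illustrative assumptions rather than official guidance.

```python
# Illustrative interpretation bands for the power-law exponent alpha;
# only the alpha > 8 "spurious fit" threshold is stated in the text above.
def interpret_alpha(alpha: float) -> str:
    if alpha > 8.0:
        return "fit likely spurious (tail not well-described by a power law)"
    if 2.0 <= alpha <= 6.0:
        return "heavy-tailed, typical well-trained range"
    if alpha < 2.0:
        return "very heavy tail; inspect layer for atypical/over-trained behavior"
    return "weakly heavy-tailed (6 < alpha <= 8); interpret with care"
```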

Health Check

  • Last commit: 1 month ago
  • Responsiveness: 1 day
  • Pull requests (30d): 1
  • Issues (30d): 1
  • Star history: 51 stars in the last 90 days

Starred by Chip Huyen (Author of AI Engineering, Designing Machine Learning Systems) and Jiayi Pan (Author of SWE-Gym; AI Researcher at UC Berkeley).

Explore Similar Projects

SWE-Gym by SWE-Gym

  • 513 stars
  • Environment for training software engineering agents
  • created 9 months ago, updated 4 days ago
  • Starred by Jared Palmer (Ex-VP of AI at Vercel; Founder of Turborepo; Author of Formik, TSDX), Jiayi Pan (Author of SWE-Gym; AI Researcher at UC Berkeley), and 1 more.

weak-to-strong by openai

  • 3k stars
  • Weak-to-strong generalization research paper implementation
  • created 1 year ago, updated 1 year ago
  • Starred by Chip Huyen (Author of AI Engineering, Designing Machine Learning Systems), Luca Antiga (CTO of Lightning AI), and 4 more.

helm by stanford-crfm

  • 2k stars
  • Open-source Python framework for holistic evaluation of foundation models
  • created 3 years ago, updated 1 day ago