transformer-debugger  by openai

Tool for language model behavior investigation

created 1 year ago
4,087 stars

Top 12.2% on sourcepulse

Project Summary

The Transformer Debugger (TDB) is a tool from OpenAI's Superalignment team for investigating the behavior of small language models. It helps researchers and engineers understand why a model produces a specific output by combining automated interpretability techniques with sparse autoencoders, and it supports rapid exploration and intervention without writing code.

How It Works

TDB identifies the model components (neurons, attention heads, and autoencoder latents) that contribute most to a specific behavior. It shows automatically generated explanations of what causes each component to activate and traces connections between components to help uncover circuits. Users can then relate individual components to observable outputs, answering questions such as why the model predicted one token rather than another, or why an attention head attends to a particular token.
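
The idea of ranking components by their effect on an output can be illustrated with a minimal sketch. It is not TDB's API: the toy two-layer network, the weights in W_IN and W_OUT, and the helper names forward, logit_diff, and ablation_effects are all hypothetical. Each hidden neuron is zero-ablated in turn, and the change in the logit difference between a target token and a distractor token is recorded.

```python
from typing import List, Optional, Tuple

# Hypothetical toy weights: 2 inputs -> 3 hidden neurons -> 2 output logits.
W_IN = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]   # one row per hidden neuron
W_OUT = [[2.0, -1.0, 0.5], [0.0, 1.0, 0.5]]   # one row per output logit


def forward(x: List[float], ablate: Optional[int] = None) -> List[float]:
    """Run the toy MLP, optionally zeroing one hidden neuron (zero-ablation)."""
    hidden = [max(0.0, sum(w * xi for w, xi in zip(row, x))) for row in W_IN]
    if ablate is not None:
        hidden[ablate] = 0.0  # the intervention: overwrite the activation
    return [sum(w * h for w, h in zip(row, hidden)) for row in W_OUT]


def logit_diff(logits: List[float], target: int = 0, distractor: int = 1) -> float:
    """Difference between the target logit and the distractor logit."""
    return logits[target] - logits[distractor]


def ablation_effects(x: List[float]) -> List[Tuple[int, float]]:
    """Rank hidden neurons by how much ablating them changes the logit difference."""
    base = logit_diff(forward(x))
    effects = {
        i: base - logit_diff(forward(x, ablate=i)) for i in range(len(W_IN))
    }
    return sorted(effects.items(), key=lambda kv: abs(kv[1]), reverse=True)


if __name__ == "__main__":
    for neuron, effect in ablation_effects([1.0, 2.0]):
        print(f"neuron {neuron}: effect on logit difference = {effect:+.1f}")
```

A positive effect means the neuron was pushing toward the target token; a negative effect means it was pushing toward the distractor. Zero-ablation is only one possible intervention, chosen here for brevity.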

Quick Start & Requirements

  • Clone the repository, then run pip install -e . from its root.
  • Requires Python and Node.js/npm.
  • Setup involves installing the activation server backend and neuron viewer frontend separately.
  • See official documentation for detailed setup and usage.

Highlighted Details

  • Integrates automated interpretability with sparse autoencoders.
  • Enables intervention in the forward pass to observe behavioral effects.
  • Provides a React-based neuron viewer for exploring model components.
  • Includes an activation server for inference and data serving.
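
For readers unfamiliar with sparse autoencoders, the sketch below shows one common formulation: an activation vector is encoded into non-negative latents with a ReLU and then decoded back into activation space. The parameters W_ENC, B_ENC, W_DEC, and B_DEC and the function names are hypothetical toy values chosen for illustration; they are not taken from TDB or from any trained autoencoder.

```python
from typing import List

# Hypothetical toy parameters: 2-dimensional activations, 3 latents.
W_ENC = [[1.0, 0.0], [0.0, 1.0], [-0.5, -0.5]]   # one row per latent
B_ENC = [0.0, 0.0, 0.0]
W_DEC = [[1.0, 0.0, -1.0], [0.0, 1.0, -1.0]]     # one column per latent direction
B_DEC = [0.0, 0.0]


def encode(x: List[float]) -> List[float]:
    """Map an activation vector to non-negative latents: ReLU(W_enc (x - b_dec) + b_enc)."""
    centered = [xi - b for xi, b in zip(x, B_DEC)]
    return [
        max(0.0, sum(w * c for w, c in zip(row, centered)) + b)
        for row, b in zip(W_ENC, B_ENC)
    ]


def decode(latents: List[float]) -> List[float]:
    """Reconstruct the activation vector: W_dec z + b_dec."""
    return [
        sum(w * z for w, z in zip(row, latents)) + b
        for row, b in zip(W_DEC, B_DEC)
    ]


if __name__ == "__main__":
    for x in ([2.0, 0.0], [-1.0, -1.0]):
        z = encode(x)
        print(x, "-> latents", z, "-> reconstruction", decode(z))
```

In this toy setup each input activates a single latent, which is the property that makes latents easier to inspect one at a time than raw neurons. Real autoencoders learn their parameters from model activations and reconstruct them only approximately.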

Maintenance & Community

  • Developed by OpenAI's Superalignment team.
  • No explicit community links (Discord/Slack) or roadmap are provided in the README.

Licensing & Compatibility

  • The README does not explicitly state a license.
  • Compatibility for commercial use or closed-source linking is not specified.

Limitations & Caveats

The tool targets small language models, and the README does not describe how well it applies to larger or more complex architectures. The README also does not state a license explicitly, which may complicate commercial adoption.

Health Check

  • Last commit: 1 year ago
  • Responsiveness: 1+ week
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 25 stars in the last 90 days
