bertviz by jessevig

Interactive tool for visualizing attention in Transformer language models

Created 6 years ago
7,642 stars

Top 6.8% on SourcePulse

Project Summary

BertViz is an interactive visualization tool for understanding attention mechanisms in Transformer-based NLP models like BERT, GPT-2, and BART. It offers multiple views (Head, Model, Neuron) to analyze attention patterns, aiding researchers and practitioners in debugging and interpreting model behavior.

How It Works

BertViz extends the Tensor2Tensor visualization tool with three distinct views for dissecting attention. The Head View displays attention for specific heads within a layer, the Model View provides a global overview across all layers and heads, and the Neuron View shows how individual neurons in the query and key vectors contribute to the attention computation. Together, these views give a comprehensive picture of how attention distributes information across tokens.
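The weights these views render come from scaled dot-product attention. As a rough illustration (a toy NumPy sketch, not BertViz code; all shapes and values here are made up), each row of the resulting matrix is one token's attention distribution, and the per-dimension query-key products summed inside the score are what the Neuron View breaks out:

```python
import numpy as np

def scaled_dot_product_attention(Q, K):
    """Attention weights of the kind BertViz visualizes: softmax(QK^T / sqrt(d))."""
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)                 # Neuron View: element-wise q*k products feed these scores
    scores -= scores.max(axis=-1, keepdims=True)  # subtract row max for numerical stability
    weights = np.exp(scores)
    return weights / weights.sum(axis=-1, keepdims=True)

rng = np.random.default_rng(0)
Q = rng.normal(size=(4, 8))  # 4 tokens, 8-dim head (toy sizes)
K = rng.normal(size=(4, 8))
A = scaled_dot_product_attention(Q, K)
# Each row of A is a distribution over the 4 tokens; the Head View draws
# these weights as the thickness of lines between token pairs.
print(A.sum(axis=1))  # rows sum to 1
```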

Quick Start & Requirements

  • Install via pip: pip install bertviz
  • Requires Jupyter Notebook and ipywidgets: pip install jupyterlab ipywidgets
  • Supports Hugging Face models; requires output_attentions=True when loading models.
  • Colab integration is straightforward with !pip install bertviz.
  • Colab Tutorial
  • Documentation

Highlighted Details

  • Supports BERT, GPT-2, BART, T5, and other Hugging Face models.
  • Neuron View requires custom model versions included with BertViz for access to query/key vectors.
  • Visualizations can be returned as HTML objects for saving or custom embedding.
  • Supports visualizing attention for sentence pairs and encoder-decoder models.

Maintenance & Community

  • Developed by Jesse Vig.
  • Project is active and has been cited in research.

Licensing & Compatibility

  • Licensed under Apache 2.0.
  • Permissive license suitable for commercial use and integration into closed-source projects.

Limitations & Caveats

The tool may perform slowly with very long inputs or large models; filtering layers is recommended. Some Colab visualizations may fail with long inputs due to runtime disconnections. The Neuron View is limited to specific custom BERT, GPT-2, and RoBERTa models included with the tool. Attention visualization does not directly equate to prediction explanation.

Health Check

  • Last Commit: 3 months ago
  • Responsiveness: 1 week
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 44 stars in the last 30 days

Explore Similar Projects

Starred by Anastasios Angelopoulos (Cofounder of LMArena), Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems"), and 7 more.

transformer-debugger by openai

  • Tool for language model behavior investigation
  • Top 0.1% · 4k stars
  • Created 1 year ago; updated 1 year ago