bertviz by jessevig

Interactive tool for visualizing attention in Transformer language models

created 6 years ago
7,573 stars

Top 7.0% on sourcepulse

Project Summary

BertViz is an interactive visualization tool for understanding attention mechanisms in Transformer-based NLP models like BERT, GPT-2, and BART. It offers multiple views (Head, Model, Neuron) to analyze attention patterns, aiding researchers and practitioners in debugging and interpreting model behavior.

How It Works

BertViz leverages the Tensor2Tensor visualization tool, extending it with distinct views to dissect attention. The Head View displays attention for specific heads within a layer, the Model View provides a global overview across all layers and heads, and the Neuron View visualizes individual neuron contributions to attention computation. This multi-faceted approach offers a comprehensive understanding of how attention distributes information.
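The per-head matrices that the Head and Model views render are ordinary scaled dot-product attention weights. A minimal, self-contained sketch (illustrative names, plain Python, not BertViz code) shows the quantity being visualized: for each token, a probability distribution over all tokens.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def attention_weights(queries, keys):
    """Scaled dot-product attention weights for a single head.

    queries, keys: lists of d-dimensional vectors (lists of floats).
    Returns a seq_len x seq_len matrix whose row i is token i's
    attention distribution over all tokens -- the matrix BertViz draws
    as lines (Head View) or thumbnails (Model View).
    """
    d = len(keys[0])
    scores = [
        [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
        for q in queries
    ]
    return [softmax(row) for row in scores]

# Toy example: 3 tokens with 2-dimensional query/key vectors.
q = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
k = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
weights = attention_weights(q, k)
# Each row sums to 1: a distribution over the 3 tokens.
```

The Neuron View goes one level deeper, exposing the elementwise query-key products that sum into each score before the softmax.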

Quick Start & Requirements

  • Install via pip: pip install bertviz
  • Requires Jupyter Notebook and ipywidgets: pip install jupyterlab ipywidgets
  • Supports Hugging Face models; requires output_attentions=True when loading models.
  • Colab integration is straightforward with !pip install bertviz.
  • Colab Tutorial
  • Documentation

Highlighted Details

  • Supports BERT, GPT-2, BART, T5, and other Hugging Face models.
  • Neuron View requires custom model versions included with BertViz for access to query/key vectors.
  • Visualizations can be returned as HTML objects for saving or custom embedding.
  • Supports visualizing attention for sentence pairs and encoder-decoder models.

Maintenance & Community

  • Developed by Jesse Vig.
  • Project is active and has been cited in research.

Licensing & Compatibility

  • Licensed under Apache 2.0.
  • Permissive license suitable for commercial use and integration into closed-source projects.

Limitations & Caveats

The tool may perform slowly with very long inputs or large models; filtering layers is recommended. Some Colab visualizations may fail with long inputs due to runtime disconnections. The Neuron View is limited to specific custom BERT, GPT-2, and RoBERTa models included with the tool. Attention visualization does not directly equate to prediction explanation.

Health Check

  • Last commit: 2 months ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star history: 229 stars in the last 90 days

Explore Similar Projects

Starred by Dominik Moritz (Professor at CMU; ML Researcher at Apple), Stas Bekman (Author of Machine Learning Engineering Open Book; Research Engineer at Snowflake), and 2 more.

ecco by jalammar

0% · 2k stars
Python library for interactive NLP model visualization in Jupyter notebooks
created 4 years ago
updated 11 months ago
Starred by Anastasios Angelopoulos (Cofounder of LMArena), Chip Huyen (Author of AI Engineering, Designing Machine Learning Systems), and 3 more.

transformer-debugger by openai

0.1% · 4k stars
Tool for language model behavior investigation
created 1 year ago
updated 1 year ago
Starred by Andrej Karpathy (Founder of Eureka Labs; Formerly at Tesla, OpenAI; Author of CS 231n), Phil Wang (Prolific Research Paper Implementer), and 4 more.

vit-pytorch by lucidrains

0.2% · 24k stars
PyTorch library for Vision Transformer variants and related techniques
created 4 years ago
updated 6 days ago