nnsight by ndif-team

SDK for interpreting/manipulating deep model internals

Created 1 year ago
663 stars

Top 50.6% on SourcePulse

Project Summary

This package provides a Python API for interpreting and manipulating the internal states of deep learning models, particularly large language models. It targets researchers and developers who need to understand, debug, or modify model behavior at a granular level, offering a powerful tool for mechanistic interpretability.

How It Works

nnsight operates by creating a computational graph of model operations within a tracing context. Users define interventions or data extraction points using proxy objects that represent model outputs or intermediate states. These proxies are then compiled into an executable graph, allowing for efficient execution and modification of model forward passes. This approach enables fine-grained control and observation without requiring direct modification of the underlying model code.
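
The deferred-execution idea described above can be illustrated with a toy sketch in plain Python. This is not nnsight code and does not use its API: the Proxy and Graph classes, and every name in the block, are hypothetical stand-ins chosen only to show how operations on proxies can be recorded first and executed later.

```python
import operator


class Proxy:
    """Placeholder for a value that does not exist until the graph runs."""

    def __init__(self, graph, fn=None, args=(), name=None):
        self.graph = graph
        self.fn = fn        # operation to apply when the graph executes
        self.args = args    # operands: other proxies or plain constants
        self.name = name    # set only for input nodes
        self.value = None   # filled in by Graph.run
        graph.nodes.append(self)

    def __add__(self, other):
        return Proxy(self.graph, operator.add, (self, other))

    def __mul__(self, other):
        return Proxy(self.graph, operator.mul, (self, other))


class Graph:
    """Records proxies in creation order, which is already a valid execution order."""

    def __init__(self):
        self.nodes = []

    def input(self, name):
        return Proxy(self, name=name)

    def run(self, **inputs):
        for node in self.nodes:
            if node.name is not None:
                node.value = inputs[node.name]
            else:
                args = [a.value if isinstance(a, Proxy) else a for a in node.args]
                node.value = node.fn(*args)


graph = Graph()
hidden = graph.input("hidden")
patched = hidden * 2 + 1          # recorded as graph nodes, nothing computed yet
value_before_run = patched.value  # still None at this point
graph.run(hidden=3)               # the recorded operations execute here
print(patched.value)              # 7
```

A real tracing library attaches such recorded operations to points inside a model's forward pass; the toy above only shows the record-then-execute pattern.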

Quick Start & Requirements

  • Install via pip: pip install nnsight
  • Requires Python and PyTorch. GPU with CUDA is recommended for performance.
  • Example usage and detailed documentation are available at nnsight.net.

Highlighted Details

  • Enables direct manipulation of model activations (e.g., adding noise).
  • Supports multi-token generation with per-token intervention.
  • Allows cross-prompt interventions by reusing computed states.
  • Facilitates ad-hoc module application and chaining.
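
As a rough illustration of the cross-prompt intervention listed above, the following plain-Python sketch saves an intermediate state from one input and substitutes it into a run on another input. It is a conceptual toy, not nnsight's API: the run helper, the layer names, and the three-stage "model" are all assumptions made for this example.

```python
def run(layers, x, save=(), patch=None):
    """Run x through named layers.

    Outputs of layers named in `save` are recorded; outputs of layers named
    in `patch` are replaced by the supplied value (the intervention).
    """
    patch = patch or {}
    saved = {}
    for name, layer in layers:
        x = layer(x)
        if name in patch:
            x = patch[name]
        if name in save:
            saved[name] = x
    return x, saved


# Hypothetical three-stage "model" operating on a list of numbers.
layers = [
    ("embed", lambda x: [v * 2 for v in x]),
    ("block", lambda x: [v + 1 for v in x]),
    ("head", lambda x: sum(x)),
]

# 1. Run the source input and keep the intermediate state of "block".
_, source = run(layers, [1, 2], save=("block",))

# 2. Run the target input unmodified, for comparison.
clean, _ = run(layers, [10, 20])

# 3. Run the target input again, reusing the saved source state at "block".
patched, _ = run(layers, [10, 20], patch={"block": source["block"]})

print(clean, patched)  # 62 8
```

Because everything downstream of the patched layer sees the source state, the patched run reproduces the source output while the clean run does not.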

Maintenance & Community

The project is maintained by the ndif-team and has a published paper. The README does not describe further community channels.

Licensing & Compatibility

The license is not explicitly stated in the README. Compatibility with commercial or closed-source projects would depend on the specific license terms.

Limitations & Caveats

The README focuses on demonstrating capabilities with GPT-2. Support for other model architectures or frameworks may vary. The library is relatively new, and extensive community support or long-term maintenance guarantees are not detailed.

Health Check

  • Last commit: 1 day ago
  • Responsiveness: 1 day
  • Pull requests (30d): 12
  • Issues (30d): 15
  • Star history: 35 stars in the last 30 days

Explore Similar Projects

Starred by Yaowei Zheng (Author of LLaMA-Factory), Shizhe Diao (Author of LMFlow; Research Scientist at NVIDIA), and 2 more.

rome by kmeng01

0.1%
668
Model editing research paper for GPT-2 and GPT-J
Created 3 years ago
Updated 1 year ago
Starred by Anastasios Angelopoulos (Cofounder of LMArena), Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems"), and 7 more.

transformer-debugger by openai

0.1%
4k
Tool for language model behavior investigation
Created 1 year ago
Updated 1 year ago
Starred by Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems"), Gabriel Almeida (Cofounder of Langflow), and 5 more.

lit by PAIR-code

0.1%
4k
Interactive ML model analysis tool for understanding model behavior
Created 5 years ago
Updated 3 weeks ago
Starred by Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems"), Neel Nanda (Research Scientist at Google DeepMind), and 1 more.

TransformerLens by TransformerLensOrg

1.0%
3k
Library for mechanistic interpretability research on GPT-style language models
Created 3 years ago
Updated 1 day ago