nnsight by ndif-team

SDK for interpreting/manipulating deep model internals

Created 2 years ago

760 stars

Top 45.9% on SourcePulse

4 Experts Love This Project

Jeff Hammerbacher

Cofounder of Cloudera

Stella Rose Biderman

Executive Director at EleutherAI

Neel Nanda

Research Scientist at Google DeepMind

Travis Fischer

Founder of Agentic

Project Summary

nnsight provides a Python API for interpreting and manipulating the internal states of deep learning models, particularly large language models. It targets researchers and developers who need to understand, debug, or modify model behavior at the level of individual modules and activations, as in mechanistic interpretability work.

How It Works

nnsight operates by creating a computational graph of model operations within a tracing context. Users define interventions or data extraction points using proxy objects that represent model outputs or intermediate states. These proxies are then compiled into an executable graph, allowing for efficient execution and modification of model forward passes. This approach enables fine-grained control and observation without requiring direct modification of the underlying model code.
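The deferred-graph idea described above can be illustrated with a toy sketch in plain Python. This is not nnsight's implementation or API: the names `Node`, `Graph`, `input`, and `run` are hypothetical and exist only to show how operations on proxy objects can be recorded first and executed later.

```python
import operator


class Node:
    """Hypothetical proxy: records an operation instead of computing a value."""

    def __init__(self, graph, fn, args):
        self.graph = graph
        self.fn = fn        # callable to apply later; None marks the input placeholder
        self.args = args    # other Node objects or plain constants
        graph.nodes.append(self)  # creation order is a valid execution order

    # Arithmetic on a proxy adds a new node to the graph.
    def __add__(self, other):
        return Node(self.graph, operator.add, (self, other))

    def __mul__(self, other):
        return Node(self.graph, operator.mul, (self, other))


class Graph:
    """Hypothetical container that replays recorded nodes on a concrete input."""

    def __init__(self):
        self.nodes = []

    def input(self):
        return Node(self, None, ())

    def run(self, input_node, value):
        results = {input_node: value}
        for node in self.nodes:
            if node.fn is None:
                continue  # the placeholder already has its value
            args = [results[a] if isinstance(a, Node) else a for a in node.args]
            results[node] = node.fn(*args)
        return results


# "Tracing": nothing is computed here, the operations are only recorded.
graph = Graph()
x = graph.input()
hidden = x * 2          # stand-in for an intermediate model state
patched = hidden + 10   # stand-in for an intervention on that state

# "Execution": the recorded graph runs on a real input.
values = graph.run(x, 3)
print(values[hidden], values[patched])
```

Because every proxy is kept in the graph, both the intermediate value and the modified value can be read back after execution, which mirrors the observe-and-intervene pattern the summary describes.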

Quick Start & Requirements

Install via pip: pip install nnsight
Requires Python and PyTorch. GPU with CUDA is recommended for performance.
Example usage and detailed documentation are available at nnsight.net.

Highlighted Details

Enables direct manipulation of model activations (e.g., adding noise).
Supports multi-token generation with per-token intervention.
Allows cross-prompt interventions by reusing computed states.
Facilitates ad-hoc module application and chaining.
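Two of the capabilities listed above, adding noise to an activation and reusing a state computed on one input in a run on another, can be sketched on a toy "model" made of plain Python functions. This is a minimal illustration under stated assumptions, not nnsight code: `run_layers`, `add_noise`, and the `interventions` mapping are hypothetical names.

```python
import random


def run_layers(layers, x, interventions=None):
    """Run layers in order; after layer i, apply interventions[i] if one is given.

    Returns the final output and the list of per-layer activations.
    """
    interventions = interventions or {}
    activations = []
    for i, layer in enumerate(layers):
        x = layer(x)
        if i in interventions:
            x = interventions[i](x)  # replace the activation with the edited one
        activations.append(x)
    return x, activations


# Toy two-layer "model" operating on lists of floats.
layers = [
    lambda v: [2.0 * a for a in v],
    lambda v: [a + 1.0 for a in v],
]

rng = random.Random(0)  # seeded so the noisy run is reproducible


def add_noise(v, scale=0.1):
    """Add Gaussian noise to every element of an activation."""
    return [a + rng.gauss(0.0, scale) for a in v]


# Clean run, then a run with noise injected after layer 0.
clean_out, clean_acts = run_layers(layers, [1.0, 2.0])
noisy_out, _ = run_layers(layers, [1.0, 2.0], {0: add_noise})

# Cross-input patching: overwrite layer 0's activation on a different input
# with the activation saved from the clean run.
patched_out, _ = run_layers(layers, [10.0, 20.0], {0: lambda _: clean_acts[0]})

print(clean_out, noisy_out, patched_out)
```

In the patched run, everything downstream of layer 0 sees the saved activation, so the final output matches the clean run even though the input differs.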

Maintenance & Community

The project is maintained by ndif-team and has a published paper. The README does not give further details on community engagement.

Licensing & Compatibility

The license is not explicitly stated in the README. Compatibility with commercial or closed-source projects would depend on the specific license terms.

Limitations & Caveats

The README focuses on demonstrating capabilities with GPT-2. Support for other model architectures or frameworks may vary. The library is relatively new, and extensive community support or long-term maintenance guarantees are not detailed.

Health Check

Last Commit

1 day ago

Responsiveness

1 day

Star History

36 stars in the last 30 days