sparsify by EleutherAI

Library for sparse autoencoders (SAEs) and transcoders on transformer activations

created 1 year ago · 595 stars · Top 55.5% on sourcepulse

View on GitHub: https://github.com/EleutherAI/sparsify

1 Expert Loves This Project

Project Summary

This library trains sparse autoencoders (SAEs) and transcoders on the activations of HuggingFace language models, following the "Scaling and evaluating sparse autoencoders" paper. It targets researchers and practitioners interested in understanding and manipulating LLM internals, and it computes activations on the fly rather than caching them to disk, so training scales to large models with zero storage overhead.
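
To make the zero-storage approach concrete, here is a minimal conceptual sketch (not the library's actual code) of capturing one layer's activations in memory with a PyTorch forward hook during an ordinary forward pass; the model name, layer index, and the idea of handing the result to a training step are placeholders.

    # Conceptual sketch of on-the-fly activation harvesting (not sparsify's code).
    # Activations are consumed as soon as the forward pass produces them,
    # so nothing is ever cached to disk.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_name = "EleutherAI/pythia-160m"  # placeholder model
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(model_name)

    captured = []

    def grab(module, inputs, output):
        # Residual-stream output of one transformer layer; a real trainer would
        # feed this straight into an SAE optimization step instead of storing it.
        hidden = output[0] if isinstance(output, tuple) else output
        captured.append(hidden.detach())

    handle = model.gpt_neox.layers[6].register_forward_hook(grab)
    batch = tokenizer(["Sparse autoencoders decompose activations."], return_tensors="pt")
    with torch.no_grad():
        model(**batch)
    handle.remove()

    acts = captured[0].flatten(0, 1)  # (tokens, d_model), ready for an SAE step
    print(acts.shape)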

How It Works

The library enforces sparsity directly with a TopK activation function, which keeps only the k largest latent pre-activations per token, rather than adding an L1 penalty to the loss; the underlying paper reports this as a Pareto improvement over L1-based training. SAEs are trained on model activations, with options for custom hookpoints beyond the default residual stream. Transcoders are also supported; rather than reconstructing a module's input, a transcoder predicts the module's output from that input.
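
For illustration, here is a minimal, self-contained sketch of a TopK SAE forward pass under common conventions (untied encoder/decoder, k active latents per token); it shows the technique, not the library's exact implementation.

    # Minimal TopK sparse autoencoder sketch (illustrative, not sparsify's code).
    import torch
    import torch.nn as nn

    class TopKSAE(nn.Module):
        def __init__(self, d_model: int, n_latents: int, k: int):
            super().__init__()
            self.k = k
            self.encoder = nn.Linear(d_model, n_latents)
            self.decoder = nn.Linear(n_latents, d_model)

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            pre_acts = self.encoder(x)
            # Keep only the k largest pre-activations per token and zero the rest:
            # sparsity is enforced structurally, so no L1 penalty is needed.
            top_vals, top_idx = pre_acts.topk(self.k, dim=-1)
            latents = torch.zeros_like(pre_acts).scatter_(-1, top_idx, torch.relu(top_vals))
            return self.decoder(latents)

    sae = TopKSAE(d_model=768, n_latents=768 * 32, k=32)   # "32x" expansion factor
    x = torch.randn(4, 768)                                # a batch of activations
    loss = (sae(x) - x).pow(2).mean()                      # plain reconstruction loss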

Quick Start & Requirements

  • Install: pip install eai-sparsify
  • Requirements: Python, HuggingFace transformers, datasets, and torch; a CUDA-capable GPU is highly recommended for training and inference.
  • Training: python -m sparsify <model_name> [dataset_name]
  • Loading Pretrained SAEs: Sae.load_from_hub("EleutherAI/sae-llama-3-8b-32x") (usage sketch after this list)
  • Documentation: https://github.com/EleutherAI/sparsify (the README in the GitHub repository)
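
A short end-to-end sketch of loading a pretrained SAE and encoding residual-stream activations. The "from sparsify import Sae" import path, the hookpoint keyword, and the encode method are assumptions based on the snippet above and may differ between releases; consult the repository README for the current API.

    # Hedged usage sketch; API names below are assumptions, check the README.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer
    from sparsify import Sae  # assumed import path

    # Pretrained SAE for one hookpoint of Llama 3 8B (hookpoint name is illustrative).
    sae = Sae.load_from_hub("EleutherAI/sae-llama-3-8b-32x", hookpoint="layers.10")

    tokenizer = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3-8B")
    model = AutoModelForCausalLM.from_pretrained(
        "meta-llama/Meta-Llama-3-8B", torch_dtype=torch.bfloat16, device_map="auto"
    )

    inputs = tokenizer("The quick brown fox", return_tensors="pt").to(model.device)
    with torch.inference_mode():
        out = model(**inputs, output_hidden_states=True)

    # Residual-stream hidden state after layer 10, flattened to (tokens, d_model).
    acts = out.hidden_states[10].flatten(0, 1)
    latents = sae.encode(acts)  # assumed method returning the sparse latent codes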

Highlighted Details

  • Trains SAEs on model activations without disk caching, enabling large-scale training with zero storage overhead.
  • Supports custom hookpoint patterns for targeting specific model submodules (e.g., attention or MLP layers).
  • Offers distributed training capabilities, including module distribution across GPUs for memory efficiency.
  • Includes experimental features such as linear k decay, a GroupMax activation function, and end-to-end training with cross-entropy (CE) or KL loss (see the GroupMax sketch after this list).
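
As a rough illustration of the GroupMax idea, assuming it partitions the latents into k equal groups and keeps only each group's maximum (an assumption to verify against the repo), the sketch below shows the mechanism; it is not the library's implementation.

    # Hedged sketch of a GroupMax-style sparse activation (assumption: latents are
    # split into k equal groups and only each group's maximum stays active).
    # Illustrative only, not sparsify's implementation.
    import torch

    def groupmax(pre_acts: torch.Tensor, k: int) -> torch.Tensor:
        *batch, n = pre_acts.shape
        groups = pre_acts.view(*batch, k, n // k)   # k groups of n // k latents each
        vals, idx = groups.max(dim=-1)              # one winning latent per group
        out = torch.zeros_like(groups)
        out.scatter_(-1, idx.unsqueeze(-1), torch.relu(vals).unsqueeze(-1))
        return out.view(*batch, n)

    pre = torch.randn(2, 1024)
    print(groupmax(pre, k=32).count_nonzero(dim=-1))  # at most 32 active latents per row

Unlike a global TopK, this variant spreads the active latents across groups rather than letting them cluster in one region of the dictionary.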

Maintenance & Community

  • Developed by EleutherAI.
  • Collaboration and discussion are encouraged in the sparse-autoencoders channel on the EleutherAI Discord server.

Licensing & Compatibility

  • The README does not explicitly state a license; EleutherAI projects are typically MIT-licensed, but check the repository's LICENSE file before depending on it.
  • Compatible with HuggingFace transformers models.

Limitations & Caveats

Because activations are computed on the fly rather than cached, hyperparameter sweeps recompute them on every run, which slows tuning. Per-hookpoint control over learning rates or latent counts is not supported; settings apply globally. Distributed training requires the number of GPUs to evenly divide the number of layers being trained.

Health Check

  • Last commit: 5 days ago
  • Responsiveness: 1 week
  • Pull requests (30d): 3
  • Issues (30d): 1
  • Star history: 76 stars in the last 90 days

Explore Similar Projects

Starred by Jeremy Howard (Cofounder of fast.ai) and Stas Bekman (Author of Machine Learning Engineering Open Book; Research Engineer at Snowflake).

SwissArmyTransformer by THUDM
Transformer library for flexible model development
1k stars · Top 0.3% on sourcepulse · created 3 years ago · updated 7 months ago