sparsify by EleutherAI

Library for sparse autoencoders (SAEs) and transcoders on transformer activations

Created 1 year ago
621 stars

Top 53.1% on SourcePulse

Project Summary

This library trains sparse autoencoders (SAEs) and transcoders on the activations of HuggingFace language models, following the "Scaling and evaluating sparse autoencoders" paper. It targets researchers and practitioners who want to understand and manipulate LLM internals, and it computes activations on the fly rather than caching them to disk, so training scales with zero storage overhead.
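The zero-storage design amounts to capturing activations during a normal forward pass and consuming them immediately in the SAE training step. The sketch below is only an illustration of that idea, not sparsify's code; the model and hookpoint are arbitrary choices for the example.

    # Rough illustration of on-the-fly activation capture (not sparsify's code).
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    name = "EleutherAI/pythia-160m"  # arbitrary small model for illustration
    model = AutoModelForCausalLM.from_pretrained(name)
    tok = AutoTokenizer.from_pretrained(name)

    captured = {}

    def grab(module, inputs, output):
        # Store the residual-stream output of one transformer block; a training
        # loop would feed this tensor to the SAE right away instead of caching it.
        captured["acts"] = output[0] if isinstance(output, tuple) else output

    handle = model.get_submodule("gpt_neox.layers.6").register_forward_hook(grab)
    with torch.no_grad():
        model(**tok("Sparse autoencoders decompose activations.", return_tensors="pt"))
    handle.remove()

    acts = captured["acts"]  # (batch, seq_len, d_model) -- ready for an SAE step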

How It Works

The library uses a TopK activation function to enforce sparsity directly, rather than an L1 penalty, an approach the underlying paper reports to be a Pareto improvement. SAEs are trained on model activations, with options for custom hookpoints beyond the default residual stream. Transcoders are also supported: instead of reconstructing the same activations, they learn to map a module's inputs to its outputs.
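For concreteness, a minimal TopK autoencoder looks roughly like the sketch below. This is illustrative only, not the library's implementation; the sizes and class name are arbitrary.

    # Minimal TopK sparse autoencoder sketch (illustrative only).
    import torch
    import torch.nn as nn

    class TopKSAE(nn.Module):
        def __init__(self, d_model: int, num_latents: int, k: int):
            super().__init__()
            self.k = k
            self.encoder = nn.Linear(d_model, num_latents)
            self.decoder = nn.Linear(num_latents, d_model)

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            pre = self.encoder(x)
            # Keep only the k largest pre-activations per token and zero the rest:
            # sparsity is enforced directly, with no L1 penalty term in the loss.
            top = torch.topk(pre, self.k, dim=-1)
            latents = torch.zeros_like(pre).scatter_(-1, top.indices, top.values.relu())
            return self.decoder(latents)

    sae = TopKSAE(d_model=768, num_latents=32 * 768, k=32)  # "32x" expansion factor
    acts = torch.randn(4, 768)                              # stand-in activations
    loss = (sae(acts) - acts).pow(2).mean()                 # reconstruction MSE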

Quick Start & Requirements

  • Install: pip install eai-sparsify
  • Requirements: Python, HuggingFace transformers, datasets, torch. GPU with CUDA is highly recommended for training and inference.
  • Training: python -m sparsify <model_name> [dataset_name]
  • Loading pretrained SAEs: Sae.load_from_hub("EleutherAI/sae-llama-3-8b-32x") (see the sketch after this list)
  • Documentation: the project README at https://github.com/EleutherAI/sparsify
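Putting the bullets above together, a typical session might look like the sketch below. The import path, the hookpoint keyword, and the encode/decode method names are assumptions about the API, not verified signatures; only the load_from_hub call itself appears in the summary above.

    # Assumed workflow; verify names against the README before relying on them.
    #   pip install eai-sparsify
    #   python -m sparsify <model_name> [dataset_name]   # e.g. a HF model repo id
    from sparsify import Sae  # import path is an assumption

    # Load a pretrained SAE from the HuggingFace Hub; the hookpoint keyword
    # argument is an assumption and may differ in the actual library.
    sae = Sae.load_from_hub("EleutherAI/sae-llama-3-8b-32x", hookpoint="layers.10")

    # Hypothetical downstream use -- method names are assumptions:
    # latents = sae.encode(activations)
    # recon = sae.decode(latents)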

Highlighted Details

  • Trains SAEs on model activations without disk caching, enabling large-scale training with zero storage overhead.
  • Supports custom hookpoint patterns for targeting specific model submodules, e.g. attention or MLP layers (see the example invocation after this list).
  • Offers distributed training capabilities, including module distribution across GPUs for memory efficiency.
  • Includes experimental features like linear k decay, GroupMax activation, and end-to-end training with CE or KL loss.
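As an illustration of the custom-hookpoint and distributed-training bullets above, an invocation might look like the following sketch. The flag names and hookpoint pattern syntax are assumptions, so check python -m sparsify --help for the real options.

    # Hypothetical invocation -- flag names and hookpoint patterns are assumptions.
    # Target attention and MLP sublayers via wildcard patterns, and shard the
    # trained modules across 8 GPUs for memory efficiency:
    torchrun --nproc_per_node 8 -m sparsify meta-llama/Meta-Llama-3-8B \
        --hookpoints "layers.*.self_attn" "layers.*.mlp" \
        --distribute_modules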

Maintenance & Community

  • Developed by EleutherAI.
  • Collaboration and discussion are encouraged in the sparse-autoencoders channel on the EleutherAI Discord server.

Licensing & Compatibility

  • The README does not explicitly state a license; EleutherAI projects are typically MIT licensed, but check the repository's LICENSE file to confirm.
  • Compatible with HuggingFace transformers models.

Limitations & Caveats

The library currently lacks activation caching, making hyperparameter tuning slower. Fine-grained control over learning rates or latent counts per hookpoint is not supported; global settings are applied. Distributed training requires the number of GPUs to evenly divide the number of layers being trained.

Health Check

  • Last commit: 3 days ago
  • Responsiveness: 1 week
  • Pull requests (30d): 2
  • Issues (30d): 1
  • Star history: 14 stars in the last 30 days

Explore Similar Projects

  • adapters by adapter-hub: Unified library for parameter-efficient transfer learning in NLP. 3k stars · 0.2% · created 5 years ago · updated 1 month ago. Starred by Ying Sheng (coauthor of SGLang), Chip Huyen (author of "AI Engineering" and "Designing Machine Learning Systems"), and 10 more.
  • axolotl by axolotl-ai-cloud: CLI tool for streamlined post-training of AI models. 10k stars · 0.5% · created 2 years ago · updated 13 hours ago. Starred by Tobi Lutke (cofounder of Shopify), Yineng Zhang (Inference Lead at SGLang; Research Scientist at Together AI), and 26 more.