pythia by EleutherAI

LLM suite for interpretability, learning dynamics, ethics, and transparency research

Created 3 years ago · 2,575 stars · Top 18.6% on sourcepulse

Project Summary

The Pythia suite provides a comprehensive set of autoregressive transformer models, ranging from 14M to 12B parameters, specifically designed for interpretability research. It offers 154 checkpoints per model, enabling detailed analysis of learning dynamics and knowledge evolution during training. The suite is ideal for researchers focused on understanding LLM internals, training stability, and ethical considerations.

How It Works

Pythia models are trained on the Pile dataset (or its deduplicated version) with consistent data ordering and training procedures across all sizes. This uniformity allows for direct comparison and causal analysis of how scale and training dynamics influence model behavior. The availability of numerous intermediate checkpoints is a key differentiator, facilitating fine-grained studies of emergent properties and internal representations.
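
For example, a minimal sketch of a checkpoint-comparison loop: it measures next-token loss on a fixed prompt at several training steps. The revision names and prompt here are illustrative assumptions; consult each model card on the Hugging Face Hub for the exact checkpoint list.

    # Compare next-token loss on a fixed prompt across a few training checkpoints.
    import torch
    from transformers import GPTNeoXForCausalLM, AutoTokenizer

    model_name = "EleutherAI/pythia-70m-deduped"
    prompt = "The Pile is a large, diverse dataset for language modeling."

    for revision in ["step1000", "step64000", "step143000"]:  # illustrative checkpoints
        tokenizer = AutoTokenizer.from_pretrained(model_name, revision=revision)
        model = GPTNeoXForCausalLM.from_pretrained(model_name, revision=revision).eval()
        inputs = tokenizer(prompt, return_tensors="pt")
        with torch.no_grad():
            # labels=input_ids yields the mean next-token cross-entropy loss
            loss = model(**inputs, labels=inputs["input_ids"]).loss
        print(f"{revision}: loss = {loss.item():.3f}")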

Quick Start & Requirements

  • Install/Run: Models can be loaded via Hugging Face Transformers; the revision argument selects a training checkpoint (a short generation example follows this list):
    from transformers import GPTNeoXForCausalLM, AutoTokenizer
    # load the 70M deduplicated model at training step 3000
    model = GPTNeoXForCausalLM.from_pretrained("EleutherAI/pythia-70m-deduped", revision="step3000")
    tokenizer = AutoTokenizer.from_pretrained("EleutherAI/pythia-70m-deduped", revision="step3000")

  • Prerequisites: PyTorch and the Hugging Face transformers library. Reproducing training requires the GPT-NeoX library, Docker, and significant disk space for datasets.
  • Resources: Inference memory scales with model size (roughly 2 bytes per parameter in fp16, so about 24 GB of GPU memory for the 12B model); full training reproduction is resource-intensive.
  • Links: Pythia Paper, Hugging Face Hub, LM Evaluation Harness
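
As referenced in the Install/Run bullet, a minimal generation check continuing from that snippet (greedy decoding; the prompt and token budget are arbitrary):

    # tokenizer and model are the checkpoint loaded in the snippet above
    inputs = tokenizer("Hello, I am", return_tensors="pt")
    tokens = model.generate(**inputs, max_new_tokens=20)
    print(tokenizer.decode(tokens[0]))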

Highlighted Details

  • 10 model sizes (14M to 12B parameters) trained on the Pile dataset.
  • 154 checkpoints available for each model, enabling fine-grained temporal analysis (see the sketch after this list).
  • Models trained with identical data order across all sizes for direct comparison.
  • Includes "v0" models with minor inconsistencies for ablation studies.
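
The checkpoint revisions follow a predictable naming scheme: per the upstream README, models are saved at step 0, at log-spaced steps up to 512, and then every 1,000 steps through 143,000, with revisions named step{N}. A small sketch enumerating them under that assumption (verify the exact scheme against the model cards):

    # Enumerate checkpoint revision names, assuming step 0 + log-spaced steps
    # up to 512, then every 1,000 steps through 143,000 (verify on the Hub).
    log_spaced = [0] + [2**i for i in range(10)]      # 0, 1, 2, 4, ..., 512
    linear = list(range(1000, 143001, 1000))          # 1000, 2000, ..., 143000
    revisions = [f"step{n}" for n in log_spaced + linear]
    print(len(revisions))  # 154 checkpoints per model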

Maintenance & Community

The project is actively maintained by EleutherAI, a prominent research collective in the LLM space, and the README's list of research papers building on Pythia is updated regularly. Community interaction happens primarily through GitHub issues and discussions.

Licensing & Compatibility

All code and models are released under the Apache License 2.0, permitting commercial use and integration into closed-source projects.

Limitations & Caveats

The README notes that evaluation benchmarks were run with an older version of the LM Evaluation Harness and may not be reproducible with current versions. Some older "v0" models have minor inconsistencies.

Health Check

  • Last commit: 1 month ago
  • Responsiveness: 1 week
  • Pull Requests (30d): 0
  • Issues (30d): 1
  • Star history: 111 stars in the last 90 days
