repeng by vgel

Python library for training and applying representation-engineering control vectors

created 1 year ago
618 stars

Top 54.2% on sourcepulse

View on GitHub
Project Summary

repeng is a Python library designed for representation engineering, enabling users to train and apply control vectors to large language models (LLMs) to steer their behavior. It targets researchers and developers looking to modify LLM outputs with minimal computational cost, offering a fast method to imbue models with specific stylistic or behavioral traits.

How It Works

repeng implements a method for training "control vectors" that are applied to an LLM's hidden states during inference, not to its weights. The library wraps Hugging Face transformers models, allowing the learned vectors to be injected into chosen layers. Training involves creating a dataset of paired, contrasting statements, reading the model's activations for each pair, and deriving a per-layer steering vector (via PCA over the activation differences) that, when added to those layers' activations, shifts the model's output toward the desired persona or style. The main advantage is speed: training reportedly takes under a minute.
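A minimal sketch of that workflow, based on the project's README (the model name and persona prompts are illustrative, and exact signatures may differ between versions):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from repeng import ControlModel, ControlVector, DatasetEntry

model_name = "mistralai/Mistral-7B-Instruct-v0.1"  # illustrative choice
device = "cuda" if torch.cuda.is_available() else "cpu"

tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token_id = 0  # repeng's examples set a pad token for batching

base = AutoModelForCausalLM.from_pretrained(
    model_name, torch_dtype=torch.float16 if device == "cuda" else torch.float32
).to(device)

# Wrap the model so learned vectors can be injected into a range of layers.
model = ControlModel(base, list(range(-5, -18, -1)))

# Dataset of paired, contrasting statements; real datasets use many
# template/suffix combinations rather than two hand-written entries.
dataset = [
    DatasetEntry(
        positive="[INST] Act extremely happy. [/INST] I feel",
        negative="[INST] Act extremely sad. [/INST] I feel",
    ),
    # ...more contrasting pairs...
]

# Reads activations for each pair and derives the control vector;
# this is the step that reportedly takes under a minute.
control_vector = ControlVector.train(model, tokenizer, dataset)
```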

Quick Start & Requirements

  • Install via pip: pip install repeng
  • Requires PyTorch and Hugging Face transformers.
  • Example notebooks may require accelerate: %pip install accelerate
  • Trained vectors can be exported to GGUF for use with quantized models in llama.cpp (see the sketch after this list).
  • Official documentation and examples are available in the notebooks folder and a linked blog post.
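A sketch of the GGUF path: the export_gguf method and llama.cpp's --control-vector flag below are taken from the project's examples and llama.cpp's options respectively, and should be verified against your installed versions.

```python
# Assumes `control_vector` was trained as in the earlier sketch.
control_vector.export_gguf("happy_vector.gguf")

# Then, with a (possibly quantized) model in llama.cpp:
#   llama-cli -m model-q4_k_m.gguf --control-vector happy_vector.gguf \
#       -p "Tell me about your day."
```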

Highlighted Details

  • Enables training of control vectors in under a minute.
  • Supports exporting trained vectors to GGUF format for use with llama.cpp.
  • Demonstrates steering model output towards specific personas (e.g., psychedelic vs. sober), as in the sketch below.
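Applying a trained vector, continuing the earlier sketch (the prompt and coefficient values are illustrative):

```python
# A positive coefficient steers toward the "positive" persona, a negative
# one toward its contrast; larger magnitudes steer harder.
input_ids = tokenizer(
    "[INST] Tell me about your day. [/INST]", return_tensors="pt"
).to(model.device)

for strength in (-2.0, 1.0, 2.0):
    model.set_control(control_vector, strength)
    output = model.generate(**input_ids, max_new_tokens=60)
    print(strength, tokenizer.decode(output.squeeze()))

model.reset()  # remove the control vector when done
```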

Maintenance & Community

The project is maintained by Theia Vogel. A CHANGELOG is available for version history.

Licensing & Compatibility

The code derives from andyzoujm/representation-engineering (MIT license). The project itself does not explicitly state a license; the upstream MIT license covers only the derived portions, so users should confirm licensing in the repository before redistributing or depending on the code.

Limitations & Caveats

Vector training does not currently work with Mixture-of-Experts (MoE) models like Mixtral. The library is in active development, and some example notebooks require manual installation of dependencies.

Health Check

  • Last commit: 6 months ago
  • Responsiveness: 1+ week
  • Pull Requests (30d): 0
  • Issues (30d): 1
  • Star History: 39 stars in the last 90 days
