Python library for representation engineering control vectors
repeng is a Python library designed for representation engineering, enabling users to train and apply control vectors to large language models (LLMs) to steer their behavior. It targets researchers and developers looking to modify LLM outputs with minimal computational cost, offering a fast method to imbue models with specific stylistic or behavioral traits.
How It Works
repeng implements a method for training "control vectors" that steer an LLM's output at inference time. The library wraps Hugging Face transformers models so that the learned vectors can be injected into the forward pass. Training involves building a dataset of paired, contrasting prompts (one exhibiting the target trait, one its opposite) and extracting, per layer, a direction in activation space via PCA over the contrasting prompts' hidden states; added to those layers' activations, this vector shifts the model's output toward the desired persona or style. Because the vectors are read out of hidden states rather than learned by fine-tuning, training reportedly takes under a minute.
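The core workflow, sketched below, follows the project's published examples: wrap a transformers model in ControlModel, build a small dataset of positive/negative prompt pairs, train a ControlVector, and apply it during generation with set_control. The model checkpoint, prompt template, and suffix strings here are illustrative placeholders; the real notebooks use a much larger set of truncated completions.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from repeng import ControlModel, ControlVector, DatasetEntry

device = "cuda:0" if torch.cuda.is_available() else "cpu"

# Load a chat model and wrap it; the layer range mirrors the published examples.
model_name = "mistralai/Mistral-7B-Instruct-v0.1"  # illustrative choice
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token_id = 0
base = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.float16).to(device)
model = ControlModel(base, list(range(-5, -18, -1)))  # layers the vector is added to

# Paired, contrasting prompts that differ only in the trait being trained.
user_tag, asst_tag = "[INST]", "[/INST]"
suffixes = ["I think", "The first thing", "Honestly,"]  # tiny illustrative set
dataset = [
    DatasetEntry(
        positive=f"{user_tag} Act as if you are extremely happy. {asst_tag} {s}",
        negative=f"{user_tag} Act as if you are extremely sad. {asst_tag} {s}",
    )
    for s in suffixes
]

# Extract the per-layer control vector from the contrasting activations.
vector = ControlVector.train(model, tokenizer, dataset)

# Add the (scaled) vector to the chosen layers' activations during generation.
model.set_control(vector, 1.5)
inputs = tokenizer(f"{user_tag} How are you feeling today? {asst_tag}", return_tensors="pt").to(device)
out = model.generate(**inputs, max_new_tokens=64, do_sample=False, repetition_penalty=1.1)
print(tokenizer.decode(out[0]))

model.reset()  # detach the control vector before reusing the model
```

Negative coefficients push in the opposite direction, e.g. a strength of -1.5 steers generation toward the "sad" side of the pair.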
Quick Start & Requirements
Install with pip install repeng. The library builds on Hugging Face transformers; some of the example notebooks additionally require accelerate, which must be installed separately (%pip install accelerate). Trained vectors can also be exported for use with llama.cpp. Worked examples are provided in the notebooks folder and a linked blog post.
Highlighted Details
Training is fast enough for interactive experimentation, and trained control vectors can be exported for use with llama.cpp, so steering is not limited to the transformers stack.
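For the llama.cpp path, the sketch below assumes the GGUF export helper shown in the project's notebooks (export_gguf) and a llama.cpp build with control-vector support; file names and flags are illustrative and may differ across versions.

```python
# `vector` is a ControlVector trained as in the earlier sketch.
# export_gguf is assumed to be available in your installed repeng version.
vector.export_gguf("happy_vector.gguf")

# The exported file can then be passed to a llama.cpp build that supports
# control vectors, e.g. (exact flag names depend on the llama.cpp version):
#
#   ./llama-cli -m mistral-7b-instruct.Q4_K_M.gguf \
#       --control-vector happy_vector.gguf \
#       -p "[INST] How are you feeling today? [/INST]"
```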
Maintenance & Community
The project is maintained by Theia Vogel. A CHANGELOG is available for version history.
Licensing & Compatibility
The code derives from andyzoujm/representation-engineering (MIT license). The project itself does not explicitly state a license, but the MIT license of its source material suggests broad compatibility.
Limitations & Caveats
Vector training does not currently work with Mixture-of-Experts (MoE) models like Mixtral. The library is in active development, and some example notebooks require manual installation of dependencies.