Model-editing research code (ROME) for GPT-2 and GPT-J
This repository provides an implementation of Rank-One Model Editing (ROME) for efficiently locating and editing factual associations within large auto-regressive transformer models. It is targeted at researchers and practitioners in NLP and AI safety who need to modify factual knowledge in pre-trained language models without full retraining.
How It Works
ROME edits a fact by applying a rank-one update to a single weight matrix in the model. It first uses causal tracing to locate the layer (typically a mid-layer MLP module) that mediates the factual association, then computes a rank-one change to that module's weights which rewrites the association while leaving unrelated behavior largely intact. Because the update has rank one and touches only one matrix, editing is computationally cheap and precise, with no retraining required.
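The rank-one idea can be seen in a small self-contained sketch (this is an illustrative simplification, not the repository's implementation; all names here are hypothetical): adding an outer product u k^T to a weight matrix W redirects the output for one unit-norm key k to a chosen target while leaving directions orthogonal to k unchanged.

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_out = 4, 3
W = rng.normal(size=(d_out, d_in))   # stand-in for an MLP projection matrix

k = rng.normal(size=d_in)            # key vector selecting the fact
k /= np.linalg.norm(k)               # unit norm, so k @ k == 1
v_target = rng.normal(size=d_out)    # desired output for that key

# Choose u so that (W + u k^T) @ k == v_target.
u = v_target - W @ k
W_edited = W + np.outer(u, k)

# The edited matrix maps k to the target exactly...
assert np.allclose(W_edited @ k, v_target)
# ...and any direction orthogonal to k is untouched, since
# W_edited @ x = W @ x + u * (k @ x) = W @ x when k @ x == 0.
```

The real method additionally chooses k and v from model activations and solves a constrained least-squares problem so that other stored associations are preserved, but the update itself has exactly this outer-product form.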
Quick Start & Requirements
Set up the conda environment with the provided script:

bash ./scripts/setup_conda.sh
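Once the environment is ready, the diagnostic that ROME builds on is causal tracing: corrupt the input, then restore one clean intermediate activation at a time and measure how much of the clean output returns. A minimal toy version of that restore-and-measure loop (again a sketch with made-up names, not the repository's code):

```python
import numpy as np

rng = np.random.default_rng(1)
d = 8
W1, W2 = rng.normal(size=(d, d)), rng.normal(size=(d, d))

def forward(x, restore_h1=None):
    """Tiny two-layer network; optionally patch in a clean hidden state."""
    h1 = np.tanh(W1 @ x)
    if restore_h1 is not None:   # the causal intervention
        h1 = restore_h1
    return W2 @ h1

x_clean = rng.normal(size=d)
h1_clean = np.tanh(W1 @ x_clean)
y_clean = forward(x_clean)

x_corrupt = x_clean + rng.normal(scale=3.0, size=d)  # noise the "subject"
y_restored = forward(x_corrupt, restore_h1=h1_clean)

# Restoring the clean hidden state recovers the clean output exactly in
# this toy; in a real transformer the recovery is partial, and the layers
# where recovery is largest localize where the fact is stored.
assert np.allclose(y_restored, y_clean)
```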
Highlighted Details
Maintenance & Community
Last updated about a year ago; the repository is currently inactive.
Licensing & Compatibility
Limitations & Caveats