Fast model editing for LLMs
Top 99.8% on SourcePulse
MEND (Model Editor Networks using Gradient Decomposition) offers a method for efficiently editing large language models at scale. It targets researchers and practitioners who need to modify model behavior without full retraining, providing a faster alternative for knowledge injection or correction.
How It Works
The project implements model editing via gradient decomposition, enabling targeted modifications to model parameters. It supports several algorithms (MEND, EFK, ENN) and experiments, including text generation (gen), fact-checking (fc), and question answering (qa), accommodating different model architectures such as GPT, seq2seq, and BERT.
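A minimal numpy sketch (not taken from the repo) of the structural fact the gradient decomposition relies on: for a linear layer, the per-example weight gradient is a rank-1 outer product of the layer input and the output delta, so an editor can operate on two small vectors instead of the full weight-sized gradient. The variable names and the finite-difference check are illustrative assumptions, not the project's learned editor networks.

```python
import numpy as np

# For a linear layer y = W @ u with loss L, the weight gradient is the
# outer product dL/dW = delta @ u.T, where delta = dL/dy. MEND-style
# editors exploit this rank-1 structure by transforming the small
# vectors (u, delta) rather than the full |W|-sized gradient.
rng = np.random.default_rng(0)
d_out, d_in = 4, 3
W = rng.normal(size=(d_out, d_in))
u = rng.normal(size=(d_in, 1))        # layer input
target = rng.normal(size=(d_out, 1))  # desired output

y = W @ u
delta = y - target                    # dL/dy for L = 0.5 * ||y - target||^2
grad_W = delta @ u.T                  # analytic gradient: rank-1 outer product

# Verify against central finite differences.
eps = 1e-6
num_grad = np.zeros_like(W)
for i in range(d_out):
    for j in range(d_in):
        Wp, Wm = W.copy(), W.copy()
        Wp[i, j] += eps
        Wm[i, j] -= eps
        Lp = 0.5 * np.sum((Wp @ u - target) ** 2)
        Lm = 0.5 * np.sum((Wm @ u - target) ** 2)
        num_grad[i, j] = (Lp - Lm) / (2 * eps)

assert np.allclose(grad_W, num_grad, atol=1e-5)
assert np.linalg.matrix_rank(grad_W) == 1  # one example -> rank-1 gradient
```

The rank-1 property is what makes editing tractable at scale: an editor network only needs to map vectors of size d_in and d_out, not matrices of size d_out × d_in.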
Quick Start & Requirements
- Setup: create a virtual environment (python -m venv env, then source env/bin/activate) and install dependencies (pip install -r requirements.txt).
- Data: place the required datasets in the mend/data directory.
- Example run: python -m run +alg=mend +experiment=gen +model=distilgpt2 data.wiki_webtext=False
- Dataset flags (data.wiki_webtext, data.zsre_nq) may need adjustment depending on the model and experiment. Multi-edit experiments require careful batch-size configuration (e.g., data.n_edits=5 batch_size=6).
- The datasets for the fc and qa experiments are sourced from De Cao et al.
Highlighted Details
Maintenance & Community
Contact is via the author's email (eric.mitchell@cs.stanford.edu). No community forums, sponsorships, or active development signals are present in the README.
Licensing & Compatibility
Limitations & Caveats
Model support is tied to the experiment type (GPT-style models for gen, seq2seq models for qa, BERT for fc).
Last updated 2 years ago; the project appears inactive.