google-research
Benchmark for training data extraction attacks on language models
This repository hosts a challenge focused on improving targeted data extraction attacks against large language models. It targets researchers and engineers interested in understanding and mitigating privacy risks associated with model memorization, offering a benchmark dataset and evaluation framework.
How It Works
The challenge centers on targeted data extraction: participants are given a prefix and must predict the specific continuation (suffix) that followed it in the model's training data. Targeted attacks are favored over untargeted ones because they are more security-relevant and easier to evaluate. The benchmark uses a subset of 20,000 examples from The Pile dataset, selected to be extractable and to have well-defined continuations.
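The prefix/suffix setup described above can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the helper names `split_example` and `exact_match_rate`, the token-ID representation, and the use of exact match as the score are hypothetical and are not taken from the repository, whose actual metric and data format may differ.

```python
from typing import List, Sequence, Tuple


def split_example(tokens: Sequence[int], prefix_len: int) -> Tuple[List[int], List[int]]:
    """Split one tokenized training example into (prefix, suffix).

    Hypothetical helper: prefix_len is a free parameter here, not a value
    taken from the benchmark.
    """
    if not 0 < prefix_len < len(tokens):
        raise ValueError("prefix_len must leave a non-empty prefix and suffix")
    return list(tokens[:prefix_len]), list(tokens[prefix_len:])


def exact_match_rate(
    predictions: Sequence[Sequence[int]],
    references: Sequence[Sequence[int]],
) -> float:
    """Fraction of predicted suffixes that equal the true suffix token for token.

    Assumed scoring rule for illustration only.
    """
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        return 0.0
    hits = sum(1 for p, r in zip(predictions, references) if list(p) == list(r))
    return hits / len(references)


# Toy data: two "training examples" as token IDs.
examples = [[1, 2, 3, 4], [5, 6, 7, 8]]
references = [split_example(ex, prefix_len=2)[1] for ex in examples]

# An attack would produce these from the prefixes; here they are hard-coded.
predictions = [[3, 4], [7, 9]]
rate = exact_match_rate(predictions, references)
```

In this toy run the first guess reproduces its suffix and the second does not, so `rate` is 0.5. Scoring against one known suffix per prefix is what makes the targeted setting straightforward to evaluate.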
Quick Start & Requirements
The load_dataset.py script can generate training data from The Pile dataset using provided CSV pointers.
detailed_description.pdf
Highlighted Details
Maintenance & Community
Licensing & Compatibility
Limitations & Caveats