PyTorch implementation of the Greedy Coordinate Gradient (GCG) algorithm
This repository provides nanoGCG, a fast and lightweight PyTorch implementation of the Greedy Coordinate Gradient (GCG) algorithm. It lets users optimize adversarial strings against causal Hugging Face language models, and it adds several enhancements over the original algorithm aimed at faster and more flexible optimization.
How It Works
nanoGCG implements the GCG algorithm, which iteratively optimizes an adversarial string so that the model's output matches a chosen target string. At each step it uses gradients with respect to the token choices to shortlist promising substitutions, evaluates a batch of candidate swaps, and greedily keeps the one with the lowest loss. nanoGCG supports several enhancements over the original algorithm: multi-position token swapping, a historical attack buffer, the mellowmax loss function, and probe sampling. Probe sampling accelerates optimization by using a smaller draft model to pre-filter candidate prompts, which can yield a significant speedup.
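The greedy loop described above can be sketched on a toy problem. This is a minimal illustration, not nanoGCG's code: the "model" is a hypothetical random score table standing in for a language-model loss, and the names `make_toy_loss`, `gcg_step`, `topk`, and `search_width` are chosen here for illustration. One simplification is labeled in the code: the step keeps the current string when no candidate improves on it.

```python
import random

VOCAB_SIZE = 20   # toy vocabulary (assumption for illustration)
SUFFIX_LEN = 6    # number of tokens being optimized


def make_toy_loss(rng):
    """Build a hypothetical stand-in for a model loss from random score tables."""
    unary = [[rng.random() for _ in range(VOCAB_SIZE)] for _ in range(SUFFIX_LEN)]
    pair = [[rng.random() for _ in range(VOCAB_SIZE)] for _ in range(VOCAB_SIZE)]

    def loss(tokens):
        total = sum(unary[i][t] for i, t in enumerate(tokens))
        total += sum(pair[a][b] for a, b in zip(tokens, tokens[1:]))
        return total

    def grad(tokens):
        """Gradient of the loss w.r.t. each position's one-hot token vector."""
        rows = []
        for i in range(SUFFIX_LEN):
            row = list(unary[i])
            if i + 1 < SUFFIX_LEN:
                nxt = tokens[i + 1]
                row = [row[v] + pair[v][nxt] for v in range(VOCAB_SIZE)]
            if i > 0:
                prev = tokens[i - 1]
                row = [row[v] + pair[prev][v] for v in range(VOCAB_SIZE)]
            rows.append(row)
        return rows

    return loss, grad


def gcg_step(tokens, loss, grad, rng, topk=4, search_width=16):
    """One greedy coordinate step: shortlist by gradient, then pick by true loss."""
    g = grad(tokens)
    # Per position, shortlist the top-k tokens with the lowest gradient value.
    shortlist = [
        sorted(range(VOCAB_SIZE), key=lambda v: g[i][v])[:topk]
        for i in range(SUFFIX_LEN)
    ]
    # Simplification: keep the current string if no candidate beats it.
    best_tokens, best_loss = tokens, loss(tokens)
    for _ in range(search_width):
        pos = rng.randrange(SUFFIX_LEN)            # random position to swap
        cand = list(tokens)
        cand[pos] = rng.choice(shortlist[pos])     # random shortlisted token
        cand_loss = loss(cand)                     # evaluate the true loss
        if cand_loss < best_loss:
            best_tokens, best_loss = cand, cand_loss
    return best_tokens, best_loss


rng = random.Random(0)
loss, grad = make_toy_loss(rng)
suffix = [rng.randrange(VOCAB_SIZE) for _ in range(SUFFIX_LEN)]
initial_loss = loss(suffix)
for _ in range(20):
    suffix, current_loss = gcg_step(suffix, loss, grad, rng)
print(f"loss: {initial_loss:.3f} -> {current_loss:.3f}")
```

In a real run the gradient comes from backpropagation through the language model and each loss evaluation is a forward pass, which is why the size of the candidate batch dominates the cost.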
Quick Start & Requirements
pip install nanogcg
Highlighted Details
Maintenance & Community
The project is tied to its authors and to the original GCG research. The README does not describe community channels or a maintenance policy.
Licensing & Compatibility
Licensed under the MIT license, permitting commercial use and integration with closed-source projects.
Limitations & Caveats
While the implementation is lightweight, running GCG attacks can be computationally intensive, requiring significant GPU resources for larger models and longer optimization runs. The effectiveness of probe sampling depends on the choice of the draft model.
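To make the dependence on the draft model concrete, here is a simplified sketch of a probe-sampling style filter as the technique is commonly described. It is not nanoGCG's code: `probe_sampling_select`, `reduction`, and the toy loss functions are assumptions for illustration, and the rank correlation assumes no tied values. The cheap draft loss scores every candidate, the expensive target loss scores only a small probe set, and the rank agreement between the two decides how many draft-ranked candidates are passed to the target.

```python
import random


def ranks(values):
    """Rank of each value (0 = smallest), assuming no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, idx in enumerate(order):
        result[idx] = rank
    return result


def spearman(xs, ys):
    """Spearman rank correlation for two equal-length lists without ties."""
    n = len(xs)
    d2 = sum((a - b) ** 2 for a, b in zip(ranks(xs), ranks(ys)))
    return 1 - 6 * d2 / (n * (n * n - 1))


def probe_sampling_select(candidates, draft_loss, target_loss, rng, reduction=4):
    """Pick the best candidate while limiting expensive target evaluations."""
    batch = len(candidates)
    draft = [draft_loss(c) for c in candidates]        # cheap: every candidate
    probe_size = max(2, batch // reduction)
    probe_idx = rng.sample(range(batch), probe_size)
    probe_target = [target_loss(candidates[i]) for i in probe_idx]  # expensive

    # Agreement in [0, 1]: 1 means draft and target rank the probe set alike.
    rho = spearman([draft[i] for i in probe_idx], probe_target)
    agreement = (1 + rho) / 2

    # High agreement -> trust the draft and keep only a few of its top picks.
    keep = max(1, int((1 - agreement) * batch / reduction))
    filtered_idx = sorted(range(batch), key=lambda i: draft[i])[:keep]

    scored = dict(zip(probe_idx, probe_target))
    for i in filtered_idx:
        if i not in scored:
            scored[i] = target_loss(candidates[i])
    best = min(scored, key=scored.get)
    return candidates[best], scored[best], len(scored)


def toy_target_loss(x):
    return x * x      # stands in for the large model's loss


def toy_draft_loss(x):
    return abs(x)     # a draft that ranks candidates exactly like the target


rng = random.Random(0)
candidates = [rng.uniform(-3.0, 3.0) for _ in range(32)]
best, best_loss, n_target_evals = probe_sampling_select(
    candidates, toy_draft_loss, toy_target_loss, rng
)
print(best_loss, n_target_evals)
```

With a well-matched draft, as in this toy case, only the probe set and one extra candidate need the expensive evaluation. A draft that ranks candidates differently from the target lowers the agreement score, so more candidates are forwarded and the selection rests on a less reliable ranking.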