OpenAttack by thunlp

Text attack toolkit for evaluating & improving NLP model robustness

created 5 years ago
739 stars

Top 47.9% on sourcepulse

Project Summary

OpenAttack is a comprehensive Python toolkit designed for generating and evaluating textual adversarial attacks against NLP models. It caters to researchers and practitioners aiming to assess model robustness, develop new attack strategies, or implement adversarial training. The package streamlines the entire adversarial attack pipeline, from text preprocessing to victim model interaction and result evaluation.

How It Works

OpenAttack employs a modular architecture, separating concerns into TextProcessor, Victim, Attacker, AttackAssist, Metric, AttackEval, and DataManager. This design facilitates extensibility, allowing users to easily integrate custom datasets, victim models, or attack algorithms. It supports various attack types (sentence, word, character level; gradient, score, decision, blind) and offers parallel processing for improved efficiency. The toolkit is tightly integrated with Hugging Face's Transformers and Datasets libraries, simplifying the use of pre-trained models and datasets.
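The separation of victim, attacker, and evaluation can be illustrated with a toy pipeline. The sketch below does not use OpenAttack itself: the class and function names (KeywordVictim, SwapCharAttacker, evaluate) are hypothetical stand-ins, chosen only to show how the three roles might fit together in a character-level, decision-based attack.

```python
from typing import Iterable, Optional


class KeywordVictim:
    """Toy victim model (hypothetical): predicts 1 if a positive keyword appears."""

    def __init__(self, positive_words: Iterable[str]):
        self.positive_words = set(positive_words)

    def predict(self, text: str) -> int:
        tokens = text.lower().split()
        return int(any(tok in self.positive_words for tok in tokens))


class SwapCharAttacker:
    """Toy character-level, decision-based attacker (hypothetical).

    It swaps two adjacent characters in one word at a time and returns the
    first candidate that changes the victim's predicted label. It only sees
    the victim's decisions, never scores or gradients.
    """

    def attack(self, victim: KeywordVictim, text: str) -> Optional[str]:
        original_label = victim.predict(text)
        words = text.split()
        for i, word in enumerate(words):
            for j in range(len(word) - 1):
                if word[j] == word[j + 1]:
                    continue  # swapping identical characters changes nothing
                perturbed = word[:j] + word[j + 1] + word[j] + word[j + 2:]
                candidate = " ".join(words[:i] + [perturbed] + words[i + 1:])
                if victim.predict(candidate) != original_label:
                    return candidate
        return None  # no label-flipping perturbation found


def evaluate(attacker: SwapCharAttacker, victim: KeywordVictim, texts: Iterable[str]) -> dict:
    """Run the attacker on every text and report the attack success rate."""
    texts = list(texts)
    results = [attacker.attack(victim, t) for t in texts]
    successes = sum(r is not None for r in results)
    rate = successes / len(texts) if texts else 0.0
    return {"total": len(texts), "success_rate": rate, "adversarial": results}


if __name__ == "__main__":
    victim = KeywordVictim(["good", "great"])
    report = evaluate(SwapCharAttacker(), victim, ["a good movie", "a dull movie"])
    print(report)
```

Because each role sits behind a small interface, swapping in a different victim or attacker leaves the evaluation loop unchanged, which is the kind of extensibility the modular design is meant to provide.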

Quick Start & Requirements

  • Install via pip: pip install OpenAttack
  • Clone repo and install: git clone https://github.com/thunlp/OpenAttack.git && cd OpenAttack && python setup.py install
  • Requires Python.
  • Demo available: python demo.py
  • Examples and documentation: README

Highlighted Details

  • Supports 15 attack models, covering sentence, word, and character-level perturbations.
  • Offers multilinguality (English and Chinese) with an extensible design for more languages.
  • Fully compatible with 🤗 Hugging Face Transformers and Datasets.
  • Includes built-in victim models (e.g., BERT, RoBERTa) and supports custom victim models and datasets.
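To make the perturbation levels concrete, here is a minimal sketch of how word-level and character-level candidates differ. It is independent of OpenAttack: the SYNONYMS table and both function names are hypothetical, and sentence-level attacks are omitted because they typically need a paraphrase model.

```python
# Hypothetical synonym table used only for this illustration.
SYNONYMS = {"good": "fine", "movie": "film"}


def word_level_candidates(text: str, synonyms: dict) -> list:
    """Replace one word at a time with its listed synonym, if any."""
    words = text.split()
    candidates = []
    for i, word in enumerate(words):
        substitute = synonyms.get(word.lower())
        if substitute is not None:
            candidates.append(" ".join(words[:i] + [substitute] + words[i + 1:]))
    return candidates


def char_level_candidates(text: str) -> list:
    """Delete the middle character of each word longer than three characters."""
    words = text.split()
    candidates = []
    for i, word in enumerate(words):
        if len(word) > 3:
            mid = len(word) // 2
            perturbed = word[:mid] + word[mid + 1:]
            candidates.append(" ".join(words[:i] + [perturbed] + words[i + 1:]))
    return candidates


if __name__ == "__main__":
    print(word_level_candidates("a good movie", SYNONYMS))
    print(char_level_candidates("a good movie"))
```

Word-level edits aim to preserve meaning through substitution, while character-level edits imitate typos; an attacker would pass candidates like these to the victim model and keep those that change its prediction.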

Maintenance & Community

  • Developed by THUNLP.
  • Contributions are welcomed.

Licensing & Compatibility

  • The README does not explicitly state a license. Compatibility for commercial use or closed-source linking is not specified.

Limitations & Caveats

Because the README does not clearly state a license, commercial adoption and use in closed-source projects may be blocked until the licensing terms are confirmed.

Health Check

  • Last commit: 2 years ago
  • Responsiveness: 1 week
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 16 stars in the last 90 days
