TextAttack by QData

Python framework for NLP adversarial attacks, data augmentation, and model training

created 5 years ago
3,229 stars

Top 15.3% on sourcepulse

View on GitHub
Project Summary

TextAttack is a comprehensive Python framework designed for researchers and practitioners in Natural Language Processing (NLP) to generate adversarial examples, augment datasets, and train NLP models. It provides a unified interface for understanding, developing, and benchmarking various adversarial attack methods against NLP models, enhancing model robustness and interpretability.

How It Works

TextAttack modularizes adversarial attacks into four key components: Goal Functions (defining attack success), Constraints (validating perturbations), Transformations (generating modifications), and Search Methods (navigating the perturbation space). This design allows for the assembly of existing attacks from literature and the creation of novel ones by combining these components, enabling model-agnostic analysis of any NLP model that can process string inputs.
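
A rough sketch of how those four components can be assembled through TextAttack's Python API follows. Class and module names match the project's documented API, but exact paths, defaults, and the model name used here are illustrative and may differ between versions:

```python
import transformers
import textattack

# Wrap any model that maps strings to predictions; here, a Hugging Face classifier.
model = transformers.AutoModelForSequenceClassification.from_pretrained(
    "textattack/bert-base-uncased-imdb")
tokenizer = transformers.AutoTokenizer.from_pretrained(
    "textattack/bert-base-uncased-imdb")
model_wrapper = textattack.models.wrappers.HuggingFaceModelWrapper(model, tokenizer)

# Goal function: the attack succeeds when the predicted label flips.
goal_function = textattack.goal_functions.UntargetedClassification(model_wrapper)

# Constraints: disallow repeated edits and stopword substitutions.
constraints = [
    textattack.constraints.pre_transformation.RepeatModification(),
    textattack.constraints.pre_transformation.StopwordModification(),
]

# Transformation: swap words for nearest neighbors in embedding space.
transformation = textattack.transformations.WordSwapEmbedding(max_candidates=20)

# Search method: greedy search ordered by word importance ranking.
search_method = textattack.search_methods.GreedyWordSwapWIR()

attack = textattack.Attack(goal_function, constraints, transformation, search_method)

# Attack a single (text, ground-truth label) example and print the result.
result = attack.attack("The movie was surprisingly enjoyable.", 1)
print(result)
```

Because the model is only accessed through the wrapper's string-to-prediction interface, the same attack can be pointed at models from any framework by supplying a different wrapper.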

Quick Start & Requirements
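
The README's quick start boils down to installing the package from PyPI and running an attack from the command line. The snippet below is a minimal sketch based on TextAttack's documented CLI; the recipe and model shortcut are illustrative:

```bash
# Install the library and the `textattack` CLI from PyPI (recent Python 3 assumed).
pip install textattack

# Run the TextFooler recipe against a pre-fine-tuned model on 10 examples.
textattack attack --recipe textfooler --model bert-base-uncased-mr --num-examples 10
```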

Highlighted Details

  • Supports 16+ adversarial attack recipes from academic literature.
  • Model-agnostic, compatible with models from any deep learning framework.
  • Includes built-in support for Hugging Face Transformers models and datasets.
  • Offers command-line and Python interfaces for attacks, data augmentation, and model training (a small augmentation sketch follows this list).
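
As a sketch of the Python augmentation interface mentioned above (the class name follows TextAttack's documentation; the keyword arguments and example sentence are illustrative):

```python
from textattack.augmentation import EmbeddingAugmenter

# Augments text by swapping words for neighbors in counter-fitted embedding space.
augmenter = EmbeddingAugmenter(pct_words_to_swap=0.1, transformations_per_example=4)
print(augmenter.augment("What I cannot create, I do not understand."))
```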

Maintenance & Community

  • Active development, currently in an "alpha" stage.
  • Join the TextAttack Slack channel for updates and help.
  • Contribution guidelines are available in CONTRIBUTING.md.

Licensing & Compatibility

  • The README does not explicitly state a license; verify licensing before commercial use or closed-source linking.

Limitations & Caveats

  • The project is in an "alpha" stage, indicating potential for ongoing changes and instability.
  • The README cautions that directly comparing attack recipes is misleading unless the constraint space is held constant, and it stresses careful evaluation of adversarial example quality, particularly semantic preservation and grammaticality.

Health Check

  • Last commit: 3 weeks ago
  • Responsiveness: 1 week
  • Pull Requests (30d): 1
  • Issues (30d): 0
  • Star history: 80 stars in the last 90 days

Explore Similar Projects

Starred by Elie Bursztein (Cybersecurity Lead at Google DeepMind), Chip Huyen (Author of AI Engineering, Designing Machine Learning Systems), and 3 more.

llm-attacks by llm-attacks

Attack framework for aligned LLMs, based on a research paper

Top 0.4% · 4k stars · created 2 years ago · updated 1 year ago