OpenRT by AI45Lab

Open-source framework for multimodal LLM red teaming

Created 4 months ago
251 stars

Top 99.8% on SourcePulse

Project Summary

OpenRT is an open-source red teaming framework for systematically testing the safety and robustness of Multimodal Large Language Models (MLLMs). It gives researchers and engineers a suite of more than 40 attack methods for discovering vulnerabilities and strengthening model defenses against adversarial inputs.

How It Works

The framework employs a modular, plugin-based architecture, allowing for flexible composition of components. It supports a wide array of attack strategies, encompassing both black-box and white-box methodologies, and extends capabilities to multimodal inputs, including text and images. Experiments are driven by YAML configuration files, streamlining the definition and execution of red teaming scenarios, while integrated evaluation mechanisms, including keyword matching and LLM judges, provide quantitative and qualitative assessments.
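As a rough illustration of the keyword-matching style of evaluation mentioned above, the sketch below flags a response as a refusal when it contains any phrase from a small list, then reports the share of responses that were not refused. It is not OpenRT's evaluator: the names REFUSAL_MARKERS, is_refusal, and attack_success_rate, and the marker phrases themselves, are assumptions made for this example.

```python
# Hypothetical sketch of keyword-matching evaluation.
# None of these names come from OpenRT; they are illustrative only.
from typing import Iterable

# Illustrative refusal phrases (assumed); a real list would be longer and tuned.
REFUSAL_MARKERS = ("i'm sorry", "i cannot", "i can't", "as an ai")


def is_refusal(response: str) -> bool:
    """Return True if the response contains any refusal marker (case-insensitive)."""
    lowered = response.lower()
    return any(marker in lowered for marker in REFUSAL_MARKERS)


def attack_success_rate(responses: Iterable[str]) -> float:
    """Fraction of responses that do not look like refusals (0.0 for empty input)."""
    collected = list(responses)
    if not collected:
        return 0.0
    successes = sum(1 for r in collected if not is_refusal(r))
    return successes / len(collected)
```

Keyword matching is cheap and deterministic but misses refusals phrased in unexpected ways, which is one reason a framework might pair it with an LLM judge.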

Quick Start & Requirements

Install from PyPI (pip install openrt) or from source (git clone https://github.com/AI45Lab/OpenRT.git, then cd OpenRT and pip install -e .). API keys for LLM access must be configured, typically through environment variables (export OPENAI_API_KEY="..."). The framework can run individual attack examples or full experiments defined in YAML configurations. Batch evaluations run through the eval.py script, which accepts command-line arguments for customization. Official documentation and demos are available via the project page.
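Because the API key is supplied through an environment variable, it can help to check for it before starting a long run. The following is a minimal sketch using only the Python standard library; the helper name require_api_key is hypothetical and is not part of OpenRT.

```python
# Hypothetical pre-flight check; require_api_key is not an OpenRT function.
import os


def require_api_key(var_name: str = "OPENAI_API_KEY") -> str:
    """Return the key stored in the given environment variable, or fail early."""
    key = os.environ.get(var_name, "").strip()
    if not key:
        raise RuntimeError(
            f"{var_name} is not set; export it before running experiments."
        )
    return key
```

Failing at startup gives a clearer error than an authentication failure partway through a batch evaluation.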

Highlighted Details

  • Features 42+ distinct attack methods, categorized into black-box (optimization, fuzzing, LLM-driven refinement, linguistic, contextual deception, multimodal, multi-agent) and white-box approaches.
  • Native support for multimodal LLMs, enabling attacks via text and image vectors.
  • A highly extensible, plugin-based architecture facilitates the addition of new attack methods and components.
  • Configuration-driven experiment setup and advanced batch evaluation capabilities via YAML and CLI.
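
One common way to build a plugin-based architecture like the one described above is a name-to-callable registry populated by a decorator. The sketch below shows that general pattern only. ATTACK_REGISTRY, register_attack, run_attack, and the toy "uppercase" transformation are all hypothetical and do not reflect OpenRT's actual classes or attack methods.

```python
# Hypothetical plugin registry; names are illustrative, not OpenRT's API.
from typing import Callable, Dict

# Maps a plugin name to a function that transforms a prompt string.
ATTACK_REGISTRY: Dict[str, Callable[[str], str]] = {}


def register_attack(name: str) -> Callable[[Callable[[str], str]], Callable[[str], str]]:
    """Decorator that records a function in the registry under the given name."""
    def decorator(fn: Callable[[str], str]) -> Callable[[str], str]:
        if name in ATTACK_REGISTRY:
            raise ValueError(f"attack {name!r} is already registered")
        ATTACK_REGISTRY[name] = fn
        return fn
    return decorator


@register_attack("uppercase")
def uppercase_attack(prompt: str) -> str:
    """Toy transformation standing in for a real plugin."""
    return prompt.upper()


def run_attack(name: str, prompt: str) -> str:
    """Look up a registered plugin by name and apply it to the prompt."""
    return ATTACK_REGISTRY[name](prompt)
```

With this pattern, a configuration file only needs to name a plugin as a string, and new plugins can be added without editing the code that dispatches them.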

Maintenance & Community

The README does not detail specific maintenance schedules, notable contributors, sponsorships, or community channels such as Discord or Slack. The primary community engagement points are the project page and the arXiv preprint.

Licensing & Compatibility

OpenRT is licensed under the GNU Affero General Public License v3.0 (AGPL-3.0). This is a strong copyleft license: modified versions and derivative works that are distributed, or offered to users over a network, must be made available under the same license, which can restrict integration into proprietary software or closed-source commercial products.

Limitations & Caveats

The provided README does not explicitly mention any limitations, known bugs, alpha status, or unsupported platforms. The framework is presented as a robust, released tool.

Health Check
Last Commit: 2 months ago
Responsiveness: Inactive
Pull Requests (30d): 0
Issues (30d): 0
Star History: 9 stars in the last 30 days
