Awesome-Multimodal-Jailbreak by liuxuannan

Multimodal generative model jailbreaking: a survey

Created 1 year ago
254 stars

Top 99.1% on SourcePulse

Project Summary

This repository serves as a comprehensive survey of jailbreak attacks and defense mechanisms targeting multimodal generative models. It addresses the critical need for understanding and mitigating vulnerabilities in AI systems that process diverse data types like text, images, and audio. Aimed at researchers, engineers, and security professionals, it offers a structured overview of the evolving landscape, enabling rapid assessment of current threats and solutions in multimodal AI security.

How It Works

The project systematically categorizes multimodal jailbreak vulnerabilities and defenses across four distinct lifecycle levels: input, encoder, generator, and output. It provides a detailed taxonomy of attack methods and defense strategies, covering various input-output modalities such as Any-to-Text, Any-to-Vision, and Any-to-Any. This structured approach allows for a granular understanding of how attacks are formulated and how defenses can be implemented at different stages of the generative process.
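The four lifecycle levels above can be pictured as a simple lookup table. This is an illustrative sketch only: the level names (input, encoder, generator, output) come from the survey's taxonomy, but the descriptions and all identifiers (`JAILBREAK_LEVELS`, `describe_level`) are hypothetical placeholders, not code from the repository.

```python
# Illustrative sketch of the survey's four-level jailbreak taxonomy.
# Level names follow the survey; descriptions are placeholder summaries.
JAILBREAK_LEVELS = {
    "input": "manipulating prompts or media before the model ingests them",
    "encoder": "perturbations targeting a modality encoder (e.g., vision)",
    "generator": "exploits against the generative backbone during decoding",
    "output": "attacks or filters applied to the generated text/image/audio",
}

# Input-output modality axes the survey covers.
MODALITY_AXES = ("Any-to-Text", "Any-to-Vision", "Any-to-Any")


def describe_level(level: str) -> str:
    """Look up a lifecycle level, case-insensitively."""
    return JAILBREAK_LEVELS.get(level.strip().lower(), "unknown level")
```

A defense surveyed at a given stage (say, an output-level content filter) would map to exactly one key in such a table, which is what makes the taxonomy useful for placing both attacks and defenses side by side.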

Quick Start & Requirements

This repository is a curated collection of research papers and resources rather than deployable software. A detailed table of contents guides readers to sections on models, attacks, defenses, and evaluation. No installation or computational requirements apply, as it is purely an informational resource.

Highlighted Details

  • Comprehensive taxonomy detailing the four levels of multimodal jailbreak (Input, Encoder, Generator, Output).
  • Extensive tables categorizing multimodal generative models (e.g., LLaVA, Stable Diffusion, GPT-4o) by modality and architecture.
  • A vast compilation of research papers on jailbreak attacks and defenses, with links to venues, dates, and code repositories where available.
  • Detailed sections on evaluation datasets and methodologies used in the field.

Maintenance & Community

The repository is described as "constantly updated" to ensure the inclusion of the most current information. Specific community channels or active maintenance team details are not provided.

Licensing & Compatibility

No specific open-source license or compatibility information is mentioned within the provided text.

Limitations & Caveats

As a survey, the repository is a snapshot of research and may not encompass all emerging threats or defenses. It focuses on academic and research resources, not on providing a ready-to-use security tool or framework.

Health Check

  • Last Commit: 17 hours ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 23 stars in the last 30 days

Explore Similar Projects

Starred by Elie Bursztein (Cybersecurity Lead at Google DeepMind), Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems"), and 6 more.

llm-attacks by llm-attacks

0.3% · 4k stars
Attack framework for aligned LLMs, based on a research paper
Created 2 years ago · Updated 1 year ago
Starred by Dan Guido (Cofounder of Trail of Bits), Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems"), and 5 more.

PurpleLlama by meta-llama

0.3% · 4k stars
LLM security toolkit for assessing/improving generative AI models
Created 1 year ago · Updated 23 hours ago