Awesome-Jailbreak-on-LLMs by yueliu1999

Collection of jailbreak methods on LLMs

created 1 year ago
821 stars

Top 44.1% on sourcepulse

Project Summary

This repository is a curated collection of recent research on jailbreaking Large Language Models (LLMs). It is aimed at researchers, security engineers, and practitioners who want to understand and mitigate LLM vulnerabilities, and it serves as a reference for work on LLM safety and security.

How It Works

The collection organizes the literature into attack types (e.g., black-box, white-box, multi-turn, multimodal) and defense strategies (learning-based, strategy-based, guard models). It also compiles the associated papers, code repositories, datasets, and evaluation methodologies, giving a structured overview of LLM security research.

Quick Start & Requirements

This repository is a curated list of research papers and code, so there is nothing to install or run. For specific implementations, follow the links to the individual papers and their code repositories.

Highlighted Details

  • Extensive coverage of recent (2023-2025) jailbreak techniques across various attack vectors.
  • Includes dedicated sections for defense strategies, guard models, and evaluation benchmarks.
  • Provides direct links to papers and code for most listed methods.
  • Features a categorized list of related "Awesome" repositories for further exploration.

Maintenance & Community

The repository is maintained by yueliu1999, and contributions are welcome via PRs and issues. The maintainer can be reached by email for specific inquiries. The project encourages citation of its featured papers.

Licensing & Compatibility

The repository itself is a collection of links and does not impose a specific license. Individual linked papers and code repositories will have their own respective licenses, which users must adhere to.

Limitations & Caveats

This is a curated list and does not provide a unified framework or tool for performing jailbreaks or defenses. Users must go to the individual resources for implementation details and dependencies. Because LLM research moves quickly, the list may need frequent updates to stay current.

Health Check

  • Last commit: 2 days ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 1
  • Issues (30d): 1
  • Star History: 175 stars in the last 90 days
