prompt-hacker-collections by yunwei37

LLM prompt hacking and defense resource

Created 2 years ago

291 stars

Top 90.9% on SourcePulse

Project Summary

Summary

This repository provides a curated collection of prompt adversarial techniques, focusing on prompt injection attacks, defenses, and reverse engineering examples. It serves as a valuable resource for researchers, students, and security professionals seeking to understand and practice LLM security vulnerabilities and mitigation strategies.

How It Works

The project organizes various prompt types, including jailbreaks, reverse engineering, attacks, and defenses, primarily in YAML format for easy parsing and application. It details concepts and provides concrete examples, such as the "DAN 11.0" jailbreak prompt, illustrating methods to bypass LLM restrictions and explore their security boundaries.

Quick Start & Requirements

This repository primarily serves as a reference collection rather than a runnable application. Installation instructions or specific software/hardware prerequisites are not detailed. Users are expected to leverage the provided prompt examples directly within their LLM interaction environments. Relevant external resources on LLM safety are linked.

Highlighted Details

Comprehensive categories: Prompt reverse engineering, jailbreaking, attacks, and defense strategies.
Structured data: Prompt examples are organized in YAML format for programmatic use.
In-depth examples: Features detailed jailbreak prompts like "DAN 11.0" and reverse engineering techniques for models like Notion AI and Midjourney.
External resources: Links to OpenAI's safety best practices and Microsoft's LLM red-teaming guides.

Maintenance & Community

The project encourages community contributions through issues and pull requests, fostering collaborative development. Specific details regarding active maintainers, community channels (like Discord/Slack),

prompt-hacker-collections by yunwei37

Explore Similar Projects

llm-security by dropbox

Learn-Prompt-Hacking by TrustAI-laboratory

camel-prompt-injection by google-research

HouYi by LLMSecurity

PromptInject by agencyenterprise

ps-fuzz by prompt-security

awesome-gpt-security by cckuailong

rebuff by protectai

damn-vulnerable-MCP-server by harishsg993010

llm-guard by protectai

prompt-engineering by brexhq

PurpleLlama by meta-llama