Chatbot exploit list for adversarial prompt engineering
This repository is a curated collection of prompts and techniques for probing and exploiting vulnerabilities in large language models (LLMs) and chatbots. It is aimed at AI researchers, security professionals, and developers who want to understand and mitigate the risks of chatbot interactions, and its primary value is as a practical resource for identifying and addressing prompt injection and social engineering vulnerabilities.
How It Works
The repository catalogs a range of attack vectors, including command-injection keywords, emoji obfuscation, character encodings (ASCII, hex, Base64, Unicode, and others), zero-width characters, and social engineering tactics. These methods aim to bypass chatbot safety filters, elicit unintended responses, or manipulate an LLM's behavior by exploiting how it processes and interprets diverse input formats. Its main advantage is a comprehensive, structured catalog of obfuscation techniques for discovering LLM weaknesses.
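To make the obfuscation categories concrete, here is a minimal Python sketch (not taken from the repository itself; the function names are illustrative) showing how a plain-text prompt can be re-encoded in a few of the formats listed above. Such transformations are useful for testing whether a safety filter normalizes input before matching against blocked keywords.

```python
import base64

# Zero-width space: renders as nothing, but breaks naive substring
# matching when interleaved between the characters of a trigger word.
ZERO_WIDTH_SPACE = "\u200b"


def to_base64(prompt: str) -> str:
    """Encode the prompt as Base64, a common transport obfuscation."""
    return base64.b64encode(prompt.encode("utf-8")).decode("ascii")


def to_hex(prompt: str) -> str:
    """Encode the prompt as a hexadecimal string."""
    return prompt.encode("utf-8").hex()


def insert_zero_width(prompt: str) -> str:
    """Interleave zero-width spaces so the text displays identically
    but no longer matches a simple keyword filter."""
    return ZERO_WIDTH_SPACE.join(prompt)


if __name__ == "__main__":
    sample = "example test string"
    print("base64:", to_base64(sample))
    print("hex:   ", to_hex(sample))
    print("zwsp:  ", insert_zero_width(sample))
```

A filter that decodes or strips these encodings before matching will catch all three variants; one that matches only on raw input will not, which is the class of weakness the catalog is meant to surface.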
Maintenance & Community
The project is a work in progress, and community contributions are invited via issues and pull requests. The repository does not name maintainers or sponsors, and there is no formal community channel such as Discord or Slack.
Licensing & Compatibility
The repository does not state a license. Under default copyright, an unlicensed repository reserves all rights, so it cannot be assumed safe to redistribute or modify without explicit permission. Users should exercise particular caution before commercial use or integration into closed-source projects.
Limitations & Caveats
The effectiveness of these exploits varies significantly across LLM models, versions, and updates; the repository is a collection of examples, and successful use typically requires experimentation and adaptation. The missing license may also pose legal or compatibility issues for some use cases.