chatbot-injections-exploits  by Cranot

Chatbot exploit list for adversarial prompt engineering

created 2 years ago
358 stars

Top 79.2% on sourcepulse

GitHubView on GitHub
Project Summary

This repository serves as a curated collection of prompts and techniques designed to test and exploit vulnerabilities in large language models (LLMs) and chatbots. It targets AI researchers, security professionals, and developers seeking to understand and mitigate potential risks associated with chatbot interactions. The primary benefit is providing a practical resource for identifying and addressing prompt injection and social engineering vulnerabilities.

How It Works

The repository categorizes various attack vectors, including command injection keywords, emoji obfuscation, character encoding (ASCII, Hex, Base64, Unicode, etc.), zero-width characters, and social engineering tactics. These methods aim to bypass chatbot safety filters, elicit unintended responses, or manipulate the LLM's behavior by exploiting how it processes and interprets diverse input formats. The advantage lies in its comprehensive catalog of obfuscation techniques, offering a structured approach to discovering LLM weaknesses.

Quick Start & Requirements

  • Usage: Copy-paste provided prompts directly into chatbot interfaces.
  • Requirements: Access to a chatbot platform (e.g., ChatGPT). No specific software installation is required.
  • Resources: Minimal; requires only a web browser and an LLM service.
  • Links: GitHub Repository

Highlighted Details

  • Extensive list of emojis categorized by their potential impact on chatbot behavior (e.g., confusion, anger, affection).
  • Detailed examples of various character encoding techniques (ASCII, Hex, Base64, Unicode, UTF-7, UTF-8, URL, HTML) for obfuscating malicious prompts.
  • Demonstrations of zero-width character encoding to hide malicious payloads within seemingly innocuous text.
  • Social engineering examples illustrating how to indirectly guide chatbots into bypassing security measures.

Maintenance & Community

The project is a work in progress, with an invitation for community contributions via issues and pull requests. There are no specific mentions of maintainers, sponsorships, or a formal community channel like Discord or Slack.

Licensing & Compatibility

The repository does not explicitly state a license. This lack of a specified license may imply all rights are reserved or that it is not intended for redistribution or modification without explicit permission. Users should exercise caution regarding commercial use or integration into closed-source projects.

Limitations & Caveats

The effectiveness of these exploits can vary significantly between different LLM models and their specific versions or updates. The repository is a collection of examples, and successful execution often requires experimentation and adaptation. The lack of a formal license could pose legal or compatibility issues for certain use cases.

Health Check
Last commit

2 years ago

Responsiveness

1 day

Pull Requests (30d)
0
Issues (30d)
0
Star History
14 stars in the last 90 days

Explore Similar Projects

Starred by Chip Huyen Chip Huyen(Author of AI Engineering, Designing Machine Learning Systems), Carol Willing Carol Willing(Core Contributor to CPython, Jupyter), and
2 more.

llm-security by greshake

0.1%
2k
Research paper on indirect prompt injection attacks targeting app-integrated LLMs
created 2 years ago
updated 2 weeks ago
Starred by Chip Huyen Chip Huyen(Author of AI Engineering, Designing Machine Learning Systems) and Pliny the Liberator Pliny the Liberator(Founder of Pliny).

L1B3RT4S by elder-plinius

1.0%
10k
AI jailbreak prompts
created 1 year ago
updated 1 week ago
Feedback? Help us improve.