awesome-prompt-injection  by FonduAI

Resource list for prompt injection attacks on ML models

created 2 years ago
314 stars

Top 87.1% on sourcepulse

GitHubView on GitHub
Project Summary

This repository serves as a curated collection of resources for understanding and mitigating prompt injection vulnerabilities in machine learning models, particularly those employing prompt-based learning. It targets AI researchers, security engineers, and developers working with LLMs, offering a centralized hub for articles, tutorials, research papers, and tools to combat this emerging threat.

How It Works

Prompt injection exploits the inability of ML models to differentiate between user-provided data and system instructions. Attackers craft malicious inputs that trick the model into executing unintended commands, potentially leading to data exfiltration, unauthorized actions, or behavioral manipulation. This collection provides insights into various attack vectors, including direct and indirect injection, and highlights techniques for detection and defense.

Quick Start & Requirements

  • Tools: Garak (Python 3.x) for LLM vulnerability scanning, Token Turbulenz (Python 3.x) for prompt injection fuzzing.
  • CTFs: Gandalf (requires interaction with a specific LLM setup), Promptalanche (scenario-based).
  • Resources: Links to articles, tutorials, and research papers are provided within the README.

Highlighted Details

  • Focuses on both direct and indirect prompt injection techniques.
  • Includes practical tools like Garak for automated LLM vulnerability scanning.
  • Features CTF challenges (Gandalf, Promptalanche) for hands-on learning.
  • Curates research papers detailing real-world attack scenarios and transferable adversarial attacks.

Maintenance & Community

  • Open to community contributions via contribution guidelines.
  • Links to the Learn Prompting Discord server for community discussion.

Licensing & Compatibility

  • The repository itself is not licensed. Individual tools and resources may have their own licenses.

Limitations & Caveats

  • This is a resource collection, not a software project with installable code. Practical application of tools requires separate setup and understanding of their dependencies.
  • Some CTF challenges may require specific LLM access (e.g., ChatGPT Plus with Browsing).
Health Check
Last commit

1 year ago

Responsiveness

Inactive

Pull Requests (30d)
0
Issues (30d)
0
Star History
49 stars in the last 90 days

Explore Similar Projects

Starred by Chip Huyen Chip Huyen(Author of AI Engineering, Designing Machine Learning Systems), Carol Willing Carol Willing(Core Contributor to CPython, Jupyter), and
2 more.

llm-security by greshake

0.2%
2k
Research paper on indirect prompt injection attacks targeting app-integrated LLMs
created 2 years ago
updated 2 weeks ago
Feedback? Help us improve.