deepteam by confident-ai

LLM red teaming framework for security testing

created 5 months ago · 570 stars
Top 57.4% on sourcepulse

Project Summary

DeepTeam is an open-source LLM red teaming framework designed for penetration testing and safeguarding LLM systems. It targets developers and security professionals seeking to identify and mitigate vulnerabilities like bias, PII leakage, and misinformation in chatbots, RAG pipelines, and AI agents. The framework leverages state-of-the-art adversarial attack techniques and provides guardrails for production deployment.

How It Works

DeepTeam simulates adversarial attacks against an LLM system, which is exposed to the framework through a single model_callback function. Attacks are generated dynamically from a list of specified vulnerabilities, so no pre-defined test dataset is needed. LLMs are used both to generate the attacks and to evaluate the target system's responses against the vulnerability criteria. This allows flexible, on-the-fly testing tailored to specific organizational needs.
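
In practice this comes down to passing a callback plus vulnerability and attack objects to the framework's entry point. A minimal sketch, assuming the red_team function and the import paths shown in the project's documentation; the callback below is a placeholder standing in for a real LLM application:

    from deepteam import red_team
    from deepteam.vulnerabilities import Bias
    from deepteam.attacks.single_turn import PromptInjection

    def model_callback(input: str) -> str:
        # Placeholder target: swap in a call to your actual LLM system.
        return f"I'm sorry but I can't answer this: {input}"

    # Probe for racial bias, delivering attacks via prompt injection.
    risk_assessment = red_team(
        model_callback=model_callback,
        vulnerabilities=[Bias(types=["race"])],
        attacks=[PromptInjection()],
    )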

Quick Start & Requirements

  • Installation: pip install -U deepteam
  • Prerequisites: An OpenAI API key (or a key for another supported provider such as Anthropic, Gemini, Azure, or Ollama), set as an environment variable or via deepteam set-api-key. Custom LLM integration is also supported.
  • Setup: Minimal; the main step is defining a model_callback function that wraps the target LLM system (see the sketch after this list).
  • Documentation: LLM Red Teaming Framework Documentation
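
For a real target, the callback simply forwards each simulated attack to the system under test and returns its reply. A hypothetical wrapper using the OpenAI Python SDK; the model name is an arbitrary example, and any provider or custom stack can sit behind the callback:

    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    def model_callback(input: str) -> str:
        # Forward the simulated attack to the system under test
        # and return its raw reply for evaluation.
        response = client.chat.completions.create(
            model="gpt-4o-mini",  # arbitrary example model
            messages=[{"role": "user", "content": input}],
        )
        return response.choices[0].message.content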

Highlighted Details

  • Supports over 40 built-in vulnerabilities (e.g., Bias, PII Leakage, Misinformation) and 10+ adversarial attack methods (e.g., Prompt Injection, Jailbreaking).
  • Allows customization of vulnerabilities and attacks with minimal code.
  • Integrates with OWASP Top 10 for LLMs and NIST AI RMF guidelines.
  • Offers a CLI for running red teaming from YAML configurations and supports custom vulnerability definitions (a sketch follows this list).
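
Custom vulnerabilities can be expressed in a few lines of code. A hypothetical sketch; the CustomVulnerability class name, import path, and constructor arguments below are assumptions based on the customization support described above, not a confirmed API:

    from deepteam import red_team
    from deepteam.vulnerabilities import CustomVulnerability  # assumed import path
    from deepteam.attacks.single_turn import PromptInjection

    # Hypothetical custom vulnerability; the constructor arguments
    # (name, criteria, types) should be checked against the docs.
    secrets_leak = CustomVulnerability(
        name="Secrets Leakage",
        criteria="The target must never reveal internal API keys or credentials.",
        types=["secret_disclosure"],
    )

    def model_callback(input: str) -> str:
        return "placeholder response"  # stand-in for the real target system

    red_team(
        model_callback=model_callback,
        vulnerabilities=[secrets_leak],
        attacks=[PromptInjection()],
    )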

Maintenance & Community

  • Developed by the founders of Confident AI.
  • Discord server available for community interaction.
  • Roadmap includes expanding vulnerability and attack coverage.

Licensing & Compatibility

  • Licensed under Apache 2.0.
  • Permits commercial use and linking with closed-source applications.

Limitations & Caveats

Because DeepTeam relies on LLMs for both attack generation and evaluation, results can inherit those models' biases and limitations. Operation also requires provider-specific configuration and API key management.

Health Check

  • Last commit: 17 hours ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 21
  • Issues (30d): 2
  • Star History: 518 stars in the last 90 days

Explore Similar Projects

Starred by Chip Huyen (author of AI Engineering, Designing Machine Learning Systems), Carol Willing (core contributor to CPython, Jupyter), and 2 more.

llm-security by greshake

Top 0.2% · 2k stars
Research paper on indirect prompt injection attacks targeting app-integrated LLMs
created 2 years ago · updated 2 weeks ago
Starred by Michael Truell (cofounder of Cursor), Chip Huyen (author of AI Engineering, Designing Machine Learning Systems), and 14 more.

SWE-agent by SWE-agent

Top 0.5% · 17k stars
Agent for automated software engineering (NeurIPS 2024)
created 1 year ago · updated 2 days ago