garak by NVIDIA

LLM vulnerability scanner for red-teaming and security assessments

Created 2 years ago

6,741 stars

Top 7.5% on SourcePulse

View on GitHub

9 Experts Love This Project

Elie Bursztein

Cybersecurity Lead at Google DeepMind

Luis Capelo

Cofounder of Lightning AI

Elvis Saravia

Founder of DAIR.AI

Dan Guido

Cofounder of Trail of Bits

and 5 more!

Project Summary

Garak is an open-source LLM vulnerability scanner designed for security researchers and developers to identify weaknesses in generative AI models. It automates the process of red-teaming LLMs by probing for issues like hallucination, data leakage, prompt injection, misinformation, toxicity, and jailbreaks, offering a structured approach similar to network security tools like nmap.

How It Works

Garak employs a modular architecture, combining static, dynamic, and adaptive probes to systematically explore LLM vulnerabilities. It supports a wide range of LLM interfaces, including Hugging Face Hub, Replicate, OpenAI API, LiteLLM, and local GGUF models, allowing users to target diverse models. The tool orchestrates probes and detectors, analyzes outputs, and logs detailed results for each interaction.

Quick Start & Requirements

Install: python -m pip install -U garak
Development install: python -m pip install -U git+https://github.com/NVIDIA/garak.git@main
Conda install: Clone repo, conda create --name garak "python>=3.10,<=3.12", conda activate garak, cd garak, python -m pip install -e .
Prerequisites: Python 3.10-3.12. API keys may be required for specific model providers (e.g., OPENAI_API_KEY).
Documentation: docs.garak.ai

Highlighted Details

Supports a broad spectrum of LLM generators including Hugging Face (local, Inference API, private endpoints), OpenAI, Replicate, Cohere, Groq, GGUF, REST endpoints, NVIDIA NIM, and OctoAI.
Features a comprehensive suite of probes covering prompt injection, data leakage, toxicity, misinformation, and more, with extensibility for custom probes.
Generates detailed JSONL reports and logs for analysis, including a hit log for identified vulnerabilities.
Offers a plugin-based architecture for easy extension of probes, detectors, generators, and harnesses.

Maintenance & Community

Developed by NVIDIA, with contributions from various individuals.
Community support via Discord.
Project links and updates available at garak.ai and Twitter @garak_llm.

Licensing & Compatibility

Licensed under the Apache License 2.0.
Permissive license suitable for commercial use and integration with closed-source applications.

Limitations & Caveats

The atkgen probe is currently a prototype and primarily supports targets that yield detectable toxicity. Some probes may require specific configurations or API keys for operation.

Health Check

Last Commit

3 days ago

Responsiveness

1 day

Pull Requests (30d)

Issues (30d)

Star History

198 stars in the last 30 days