audit by evilsocket

8-stage vulnerability-discovery agent

Created 1 month ago

708 stars

Top 47.6% on SourcePulse

1 Expert Loves This Project

Wes McKinney

Author of Pandas

Project Summary

evilsocket/audit is an 8-stage vulnerability-discovery agent that automates bug hunting with a pipeline inspired by Cloudflare's Project Glasswing. It targets developers and security researchers who want a structured, multi-agent approach to finding vulnerabilities in codebases, and it relies on deliberate disagreement and reachability analysis to make findings more reliable.

How It Works

This project reimplements the Cloudflare Project Glasswing pipeline, favoring a multi-agent strategy over monolithic LLM calls. It rests on three ideas: many narrow agents work in parallel on tightly scoped questions; a second agent on a different model tries to disprove each finding (deliberate disagreement); and a reachability trace confirms that attacker-controlled input can actually reach the potential vulnerability sink. The 8 stages (Recon, Hunt, Validate, Gapfill, Dedupe, Trace, Feedback, Report) each run with a specific model and JSON schema, so outputs are validated and keep a stable shape.
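
The two filtering ideas above, deliberate disagreement and the reachability trace, can be illustrated with a minimal sketch. Only the stage names come from the summary. The Finding class, the validate, trace and keep helpers, and the dictionary call graph are hypothetical stand-ins, not the project's actual code or schemas.

```python
from collections import deque
from dataclasses import dataclass

# Stage names as listed in the summary; everything below is a hypothetical sketch.
STAGES = ["Recon", "Hunt", "Validate", "Gapfill", "Dedupe", "Trace", "Feedback", "Report"]


@dataclass
class Finding:
    title: str
    sink: str                       # function where the bug would trigger
    survived_disproof: bool = False
    reachable: bool = False


def validate(finding, disprover):
    """Deliberate disagreement: a second reviewer tries to refute the finding.

    `disprover` returns True when it manages to disprove the finding.
    """
    finding.survived_disproof = not disprover(finding)
    return finding


def trace(finding, call_graph, entry_points):
    """Reachability: breadth-first search from attacker-controlled entry points to the sink."""
    seen = set(entry_points)
    queue = deque(entry_points)
    while queue:
        node = queue.popleft()
        if node == finding.sink:
            finding.reachable = True
            break
        for callee in call_graph.get(node, []):
            if callee not in seen:
                seen.add(callee)
                queue.append(callee)
    return finding


def keep(finding):
    """A finding is reported only if it survived disproof and is reachable."""
    return finding.survived_disproof and finding.reachable
```

In this sketch a finding whose sink sits in code that no entry point calls is dropped even if the second reviewer could not refute it, which is the point of running the trace as a separate step.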

Quick Start & Requirements

Install: Create and activate a Python virtual environment, then run pip install -e . from the repository root.
Auth: Run claude login for interactive use, or claude setup-token for non-interactive/CI use; the latter generates an OAuth token.
Run: Execute audit run --repo /path/to/target --run-id my-run. Check status with audit status --run-id my-run and generate a report with audit report --run-id my-run --format md > report.md.
Prerequisites: Python. By default, it uses subscription billing via your Claude.ai login. It can be configured to use LLM gateways like OpenRouter or other providers exposing the Anthropic Messages API by setting ANTHROPIC_BASE_URL and ANTHROPIC_AUTH_TOKEN environment variables.
Links: No specific external documentation or demo links are provided beyond the CLI commands.
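
For CI use, the commands above could be assembled from a small script. The sketch below only builds the argument lists and the environment. The subcommands, flags and environment variable names are the ones quoted above, while the helper names audit_commands and gateway_env are hypothetical.

```python
import os


def audit_commands(repo, run_id, report_format="md"):
    """Return the run, status and report invocations as argv lists."""
    return [
        ["audit", "run", "--repo", repo, "--run-id", run_id],
        ["audit", "status", "--run-id", run_id],
        ["audit", "report", "--run-id", run_id, "--format", report_format],
    ]


def gateway_env(base_url, auth_token, base_env=None):
    """Copy the environment and point it at an Anthropic-compatible gateway.

    Both values are placeholders supplied by the caller; no real endpoint is assumed.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["ANTHROPIC_BASE_URL"] = base_url
    env["ANTHROPIC_AUTH_TOKEN"] = auth_token
    return env
```

Each argv list can then be passed to subprocess.run(argv, env=env, check=True). For the report step, capture stdout and write it to report.md, since the shell redirection shown above is not part of the argument list.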

Highlighted Details

The Recon stage mines git history for past security patches to seed hunts in related, potentially unpatched files.
Supports "logic chain" tasks for identifying high-impact, multi-component vulnerability paths.
Optional live-target reproduction against a running deployment allows findings to be validated against a live service.
Scope notes can be provided to exclude known non-bug surfaces, such as test-only endpoints or plaintext API keys that are an intended feature.
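
The git-history idea from the Recon stage can be sketched as a keyword filter over commit messages. This is an assumption about the general technique, not the project's implementation: the SECURITY_HINTS pattern and the seed_files helper are hypothetical, and the input is a pre-parsed list of (message, files) pairs rather than a live repository.

```python
import re

# Hypothetical hint list; a real tool would likely use a richer signal than keywords.
SECURITY_HINTS = re.compile(
    r"\b(CVE-\d{4}-\d+|security|overflow|injection|sanitiz\w*|traversal|use-after-free)\b",
    re.IGNORECASE,
)


def seed_files(commits):
    """Return files touched by security-looking commits, in first-seen order.

    `commits` is an iterable of (message, [paths]) tuples.
    """
    seeds = []
    for message, paths in commits:
        if SECURITY_HINTS.search(message):
            for path in paths:
                if path not in seeds:
                    seeds.append(path)
    return seeds
```

The returned paths are only the files that were patched. Expanding them to related, possibly unpatched files (for example siblings in the same directory) would be a further step that this sketch leaves out.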

Maintenance & Community

No specific details regarding maintainers, community channels (e.g., Discord, Slack), sponsorships, or roadmap are present in the provided README.

Licensing & Compatibility

The project is MIT-licensed, allowing for free reuse, including in commercial and closed-source applications, with no warranty.

Limitations & Caveats

Hunt agents execute Bash commands within per-task scratch directories and are not sandboxed at the OS level. Running the audit on untrusted target sources could pose a security risk, as malicious build scripts could execute on the host during PoC compilation. The agent also reads all specified directories, including any .env or secrets directories within the target.
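
Because the agent reads everything under the target, one possible precaution is to list secret-looking paths before starting a run. The helper below is a hypothetical preflight check and not part of the tool. The SENSITIVE_NAMES set is an assumed starting point and should be adapted to the target.

```python
from pathlib import Path

# Assumed names worth reviewing before a run; extend as needed.
SENSITIVE_NAMES = {".env", "secrets", ".aws", ".ssh"}


def find_sensitive_paths(target):
    """Return sorted POSIX-style paths, relative to `target`, that look like secrets."""
    root = Path(target)
    hits = []
    for path in root.rglob("*"):
        # Match exact names plus variants such as .env.production
        if path.name in SENSITIVE_NAMES or path.name.startswith(".env."):
            hits.append(path.relative_to(root).as_posix())
    return sorted(hits)
```

Anything this reports can be moved out of the tree or removed from a throwaway copy before the audit starts. It does not address the separate risk of build scripts executing on the host, which would need OS-level isolation such as a container or virtual machine.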

Health Check

Last Commit

1 month ago

Responsiveness

Inactive

Star History

100 stars in the last 30 days