CHYing-agent  by yhy0

AI agent for automated cybersecurity penetration testing

Created 2 months ago
284 stars

Top 92.1% on SourcePulse

Summary

yhy0/CHYing-agent automates CTF (capture-the-flag) challenges with AI agents. It offers a practical framework for AI-driven penetration testing that mitigates LLM hallucinations and streamlines tool integration. Aimed at cybersecurity and AI practitioners, it shows how a minimalist design and collaboration between LLMs can perform well in competition, as evidenced by its top-9 finish in the Tencent Cloud Hacker Marathon.

How It Works

The core is a "Dual Agent" architecture: a strategic "Advisor Agent" paired with an execution-focused "Main Attacker Agent." The Advisor provides high-level guidance and reviews progress, which mitigates LLM confabulation in long contexts. The Main Attacker carries out tasks using a deliberately small but powerful toolset. This "incomplete trust" engineering philosophy balances the flexibility of the AI with safeguards, applying LLMs to the "known unknown" problems typical of CTFs. The design favors a small, clear decision space for the LLM over complex tool management.
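The Advisor/Attacker split described above can be sketched as a small loop. This is a minimal illustration, not the project's implementation: the `DualAgentLoop` class, the `review_every` cadence, and the prompt wording are all assumptions, and the two agents are plain callables so the sketch runs without any LLM API.

```python
from dataclasses import dataclass, field
from typing import Callable, List

# Hypothetical type: any function that maps a prompt string to a reply string.
LLM = Callable[[str], str]


@dataclass
class DualAgentLoop:
    """Illustrative only: an Advisor periodically reviews the Attacker's steps."""

    advisor: LLM
    attacker: LLM
    review_every: int = 3  # assumed cadence; the source does not state one
    history: List[str] = field(default_factory=list)
    guidance: str = ""

    def step(self, observation: str) -> str:
        # Refresh guidance on the first step and then every `review_every` steps,
        # showing the Advisor only a short recent window instead of the full log.
        if len(self.history) % self.review_every == 0:
            recent = "\n".join(self.history[-self.review_every:])
            self.guidance = self.advisor(
                f"Recent steps:\n{recent}\nObservation:\n{observation}"
            )
        # The Attacker always acts with the latest guidance in its prompt.
        action = self.attacker(
            f"Guidance: {self.guidance}\nObservation: {observation}"
        )
        self.history.append(action)
        return action
```

Keeping the Advisor's input to a short recent window is one way to limit the long-context drift the section mentions; the actual project may schedule and scope reviews differently.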

Quick Start & Requirements

The competition version is on the tx-tch branch; see QUICKSTART.md for details. It requires Kali Linux (or Docker) with tools such as sqlmap, ffuf, and curl, Python for scripting, and API access to an LLM (Claude, DeepSeek, and MiniMax are mentioned). Docker is used for execution. Setup time and resource footprint are not detailed.

Highlighted Details

  • Achieved 9th place in the Tencent Cloud Hacker Marathon - Intelligent Penetration Challenge.
  • Employs a "Dual Agent" system (Advisor + Main Attacker) for collaborative strategy and hallucination reduction.
  • Features a minimalist toolset: execute_command (Kali commands), execute_python_poc (Python scripts), and submit_flag.
  • Is itself largely AI-generated: Claude assisted in writing most of the code from the provided designs.
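The three-tool surface listed above can be sketched as a small dispatch table. Only the tool names (`execute_command`, `execute_python_poc`, `submit_flag`) come from the source; the function bodies, the `dispatch` helper, the timeout, and the `flag{...}` format check are assumptions made for illustration. The real project runs commands inside a Kali/Docker environment rather than on the local host.

```python
import subprocess
import sys
from typing import Callable, Dict


def _run(argv, shell: bool = False, timeout: int = 30) -> str:
    """Run a process and return combined stdout and stderr as text."""
    try:
        proc = subprocess.run(
            argv, shell=shell, capture_output=True, text=True, timeout=timeout
        )
    except subprocess.TimeoutExpired:
        return "[timeout]"
    return proc.stdout + proc.stderr


def execute_command(command: str) -> str:
    # Assumption: runs on the local shell; the project targets a Kali container.
    return _run(command, shell=True)


def execute_python_poc(code: str) -> str:
    # Runs the proof-of-concept script in a separate interpreter process.
    return _run([sys.executable, "-c", code])


def submit_flag(flag: str) -> str:
    # Placeholder: only checks an assumed flag{...} shape, submits nothing.
    ok = flag.startswith("flag{") and flag.endswith("}")
    return "accepted format" if ok else "rejected format"


TOOLS: Dict[str, Callable[[str], str]] = {
    "execute_command": execute_command,
    "execute_python_poc": execute_python_poc,
    "submit_flag": submit_flag,
}


def dispatch(name: str, argument: str) -> str:
    """Route a tool call chosen by the LLM; unknown names are reported back."""
    if name not in TOOLS:
        return f"unknown tool: {name}"
    return TOOLS[name](argument)
```

With only three entries in the table, the model's decision at each turn reduces to "shell command, Python script, or submit", which is the narrow decision space the design aims for.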

Maintenance & Community

Open-source project by yhy0. Community channels (Discord, Slack), a roadmap, and sponsorships are not detailed. The project references langgraph and Cyber-AutoAgent.

Licensing & Compatibility

The license type and any restrictions (e.g., on commercial use) are unspecified. Compatibility is demonstrated in Linux/Docker environments for penetration testing.

Limitations & Caveats

An experimental LLM role-swapping feature was not fully validated because of API cost constraints. The Main Attacker Agent's dense prompt mixes strategy, coding norms, and tool usage, which causes execution errors. A future refactoring aims for a layered architecture with specialized agents to reduce the cognitive load on any single prompt.
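One way to picture the layered direction mentioned above is to split the single dense prompt into named layers and give each specialized agent only the layers it needs. This is a speculative sketch: the layer names follow the three concerns listed in the paragraph, while the role names (`planner`, `poc_writer`, `operator`), the prompt texts, and `build_prompt` are hypothetical and do not come from the project.

```python
from typing import Dict, Tuple

# Hypothetical prompt layers, one per concern named in the limitations above.
PROMPT_LAYERS: Dict[str, str] = {
    "strategy": "Decide which attack surface to probe next.",
    "code_norms": "Write short, self-contained Python PoCs.",
    "tool_usage": "Call exactly one tool per turn.",
}

# Hypothetical specialists: each role sees only the layers listed for it.
SPECIALISTS: Dict[str, Tuple[str, ...]] = {
    "planner": ("strategy",),
    "poc_writer": ("code_norms",),
    "operator": ("tool_usage",),
}


def build_prompt(role: str, task: str) -> str:
    """Assemble a role-specific prompt from that role's layers plus the task."""
    sections = [PROMPT_LAYERS[name] for name in SPECIALISTS[role]]
    return "\n".join(sections + [f"Task: {task}"])
```

For example, `build_prompt("planner", "map the login flow")` contains the strategy text but neither the coding norms nor the tool-usage rules, so each agent reads a shorter, single-purpose prompt.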

Health Check

  • Last commit: 1 month ago
  • Responsiveness: Inactive
  • Pull requests (30d): 0
  • Issues (30d): 2
  • Star history: 52 stars in the last 30 days
