yhy0/CHYing-agent
AI agent for automated cybersecurity penetration testing
Top 92.1% on SourcePulse
Summary
This project, yhy0/CHYing-agent, automates CTF challenges using AI agents. It offers a practical framework for AI penetration testing that mitigates LLM hallucinations and streamlines tool integration. Targeted at cybersecurity and AI professionals, it showcases how minimalist design and LLM collaboration yield high performance, as evidenced by its Top 9 ranking in the Tencent Cloud Hacker Marathon.
How It Works
The core is a "Dual Agent" architecture: a strategic "Advisor Agent" and an execution-focused "Main Attacker Agent." The Advisor provides high-level guidance and reviews, mitigating LLM confabulation in long contexts. The Main Attacker executes tasks using a deliberately limited, powerful toolset. This "incomplete trust" engineering philosophy balances AI flexibility with safeguards, leveraging LLMs for CTF's "known unknown" problems. The design prioritizes clear LLM decision spaces over complex tool management.
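The advisor/attacker split described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the class names, the `run` loop, the "tool_name: argument" reply format, and the three-step history window are all assumptions made for the example, and the LLMs and tools are stubs that execute nothing. Only the tool names (execute_command, execute_python_poc, submit_flag) come from the project summary.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical stand-in for an LLM call: takes a prompt, returns text.
LLM = Callable[[str], str]


@dataclass
class Advisor:
    """Strategic agent: reviews progress and suggests a next step."""
    llm: LLM

    def review(self, goal: str, history: List[str]) -> str:
        # Assumption: the advisor is shown only a short recent window.
        recent = "\n".join(history[-3:])
        return self.llm(f"Goal: {goal}\nRecent results:\n{recent}\nAdvise.")


@dataclass
class Attacker:
    """Execution agent: may only call tools from a small fixed registry."""
    llm: LLM
    tools: Dict[str, Callable[[str], str]]

    def act(self, goal: str, advice: str) -> str:
        # Assumed reply format for this sketch: "tool_name: argument".
        reply = self.llm(f"Goal: {goal}\nAdvice: {advice}\nPick a tool.")
        name, _, arg = reply.partition(":")
        name = name.strip()
        if name not in self.tools:
            # "Incomplete trust": anything outside the registry is refused.
            return f"rejected: unknown tool {name!r}"
        return self.tools[name](arg.strip())


def run(goal: str, advisor: Advisor, attacker: Attacker,
        max_steps: int = 5) -> List[str]:
    """Alternate advice and action until a flag is accepted or steps run out."""
    history: List[str] = []
    for _ in range(max_steps):
        advice = advisor.review(goal, history)
        result = attacker.act(goal, advice)
        history.append(result)
        if result.startswith("flag accepted"):
            break
    return history


def scripted(replies: List[str]) -> LLM:
    """Build a fake LLM that returns canned replies in order."""
    it = iter(replies)
    return lambda prompt: next(it)


def demo() -> List[str]:
    # Stub tools: they only echo their input and never run anything.
    tools = {
        "execute_command": lambda arg: f"ran: {arg}",
        "execute_python_poc": lambda arg: f"poc: {arg}",
        "submit_flag": lambda arg: f"flag accepted: {arg}",
    }
    advisor = Advisor(llm=lambda prompt: "Enumerate the web root first.")
    attacker = Attacker(
        llm=scripted([
            "execute_command: curl http://target.example/",
            "delete_everything: now",  # not in the registry, so refused
            "submit_flag: flag{demo}",
        ]),
        tools=tools,
    )
    return run("capture the flag", advisor, attacker)


if __name__ == "__main__":
    for step in demo():
        print(step)
```

Keeping the registry this small means the attacker's reply can be validated with a single dictionary lookup, which is one way to read the "clear decision space" idea; the real project may enforce it differently.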
Quick Start & Requirements
The competition version is on the tx-tch branch (see QUICKSTART.md for details). It requires Kali Linux (or Docker) with tools (sqlmap, ffuf, curl), Python scripting, and LLM API access (Claude, DeepSeek, and MiniMax are mentioned). Docker is used for execution. Setup time and footprint are not detailed.
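Before following the project's own setup steps, it may help to confirm that the command-line tools named above are on the PATH. The snippet below is a small preflight sketch and is not part of the project; the function names `find_tools` and `missing_tools` are made up for this example, and it relies only on the standard library's `shutil.which`.

```python
import shutil
from typing import Dict, Iterable, List, Optional

# Tool names taken from the requirements listed above.
REQUIRED_TOOLS = ("sqlmap", "ffuf", "curl")


def find_tools(names: Iterable[str] = REQUIRED_TOOLS) -> Dict[str, Optional[str]]:
    """Map each tool name to its resolved path, or None if not on PATH."""
    return {name: shutil.which(name) for name in names}


def missing_tools(found: Dict[str, Optional[str]]) -> List[str]:
    """Return the names whose lookup came back empty."""
    return [name for name, path in found.items() if path is None]


if __name__ == "__main__":
    found = find_tools()
    for name, path in found.items():
        print(f"{name}: {path or 'NOT FOUND'}")
    if missing_tools(found):
        print("Install the missing tools, or use a Kali/Docker environment.")
```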
Highlighted Details
Tools: execute_command (Kali commands), execute_python_poc (Python scripts), and submit_flag.
Maintenance & Community
Open-source project by yhy0. Community channels (Discord, Slack), roadmap, or sponsorships are not detailed. References langgraph and Cyber-AutoAgent.
Licensing & Compatibility
License type and restrictions (e.g., commercial use) are unspecified. Compatibility demonstrated within Linux/Docker environments for penetration testing.
Limitations & Caveats
An experimental LLM role-swapping feature was not fully validated due to API cost constraints. The Main Agent's dense prompt mixes strategy, code norms, and tool usage, causing execution errors. Future refactoring aims for a layered architecture with specialized agents to reduce cognitive load.
1 month ago
Inactive