claude-code_evil by Ta0ing

AI model safety guardrails bypassed for cybersecurity tasks

Created 1 week ago


272 stars

Top 94.7% on SourcePulse

View on GitHub
Project Summary

This project offers a modified Claude Code v2.1.88, specifically engineered to remove prompt-level cybersecurity restrictions. It targets users who need to generate code for security tasks previously blocked by AI safety guardrails, in contexts such as security research, CTFs, and education.

How It Works

The core modification clears the CYBER_RISK_INSTRUCTION constant. This directive previously blocked code generation for destructive techniques, DoS attacks, mass targeting, supply chain compromises, malicious detection evasion, and dual-use security tools without explicit authorization. Clearing it directly bypasses the AI-imposed prompt-level safety filter.

Quick Start & Requirements

The README mentions a "full version, including manual, compilation, etc." for Claude Code v2.1.88 but provides no explicit installation commands, build instructions, or specific prerequisites (e.g., Python version, GPU).

Highlighted Details

  • The removed prompt-level directive previously limited assistance to explicitly authorized security testing, defensive security, CTFs, and educational contexts, while blocking destructive techniques, DoS attacks, mass targeting, supply chain compromise, and malicious detection evasion.
  • Restrictions on dual-use tools (C2 frameworks, credential testing, exploit development) are lifted, removing the need for explicit authorization contexts.
  • Several other security layers remain, including URL generation limits, prompt injection detection, OWASP protection, sensitive operation confirmation, permission system checks, tool-level security analyses, sandbox isolation, and Unicode sanitization.

Maintenance & Community

The modification date is March 31, 2026, attributed to "Claude (AI Assistant)". No details are provided on contributors, sponsorships, community channels, roadmaps, or social media.

Licensing & Compatibility

The README does not specify a software license, leaving suitability for commercial use or closed-source integration undetermined.

Limitations & Caveats

This modification intentionally removes Anthropic's safety measures and may therefore produce unsafe outputs. Users assume all risks; production deployment is strongly discouraged, and use is recommended only in controlled, isolated environments.

Health Check

Last Commit: 1 week ago
Responsiveness: Inactive
Pull Requests (30d): 0
Issues (30d): 2
Star History: 272 stars in the last 11 days
