claude-code_evil by Ta0ing

AI model safety guardrails bypassed for cybersecurity tasks

Created 1 week ago


272 stars

Top 94.7% on SourcePulse

View on GitHub
Project Summary

This project offers a modified Claude Code v2.1.88, specifically engineered to remove prompt-level cybersecurity restrictions. It targets users who need to generate code for security tasks previously blocked by AI safety guardrails, in contexts such as security research, CTFs, and education.

How It Works

The core modification clears the CYBER_RISK_INSTRUCTION constant. This directive previously blocked code generation for destructive techniques, DoS attacks, mass targeting, supply chain compromises, malicious detection evasion, and dual-use security tools without explicit authorization. Clearing it directly bypasses the AI-imposed prompt-level safety filter.

Quick Start & Requirements

The README mentions a "full version, including manual, compilation, etc." for Claude Code v2.1.88 but provides no explicit installation commands, build instructions, or specific prerequisites (e.g., Python version, GPU).

Highlighted Details

  • The removed prompt-level directive previously limited assistance to explicitly authorized security testing, defensive security, CTFs, and educational contexts, while blocking destructive techniques, DoS attacks, mass targeting, supply chain compromise, and malicious detection evasion.
  • Restrictions on dual-use tools (C2 frameworks, credential testing, exploit development) are lifted, removing the need for explicit authorization contexts.
  • Several other security layers remain, including URL generation limits, prompt injection detection, OWASP protection, sensitive operation confirmation, permission system checks, tool-level security analyses, sandbox isolation, and Unicode sanitization.

Maintenance & Community

The modification date is March 31, 2026, attributed to "Claude (AI Assistant)". No details are provided on contributors, sponsorships, community channels, roadmaps, or social media.

Licensing & Compatibility

The README does not specify a software license, leaving suitability for commercial use or closed-source integration undetermined.

Limitations & Caveats

This modification intentionally removes Anthropic's safety measures and may therefore produce unsafe outputs. Users assume all risks; production deployment is strongly discouraged, and use is recommended only in controlled, isolated environments.

Health Check

Last Commit: 1 week ago
Responsiveness: Inactive
Pull Requests (30d): 0
Issues (30d): 2
Star History: 272 stars in the last 11 days
