slowmist-agent-security  by slowmist

AI agent security framework for adversarial environments

Created 1 month ago
432 stars

Top 68.6% on SourcePulse

Project Summary

This framework provides a security review system for AI agents operating in adversarial environments. Its core principle is that all external inputs are untrusted until verified. It offers a structured approach to auditing code, analyzing URLs, assessing on-chain activity, and evaluating products, helping users of LLM-based agents defend against malicious inputs and supply chain attacks.

How It Works

The framework integrates as a "skill" within LLM agent systems like OpenClaw, automatically applying security reviews when encountering new skills, GitHub repositories, external URLs, blockchain addresses, or product recommendations. It leverages predefined patterns for dangerous code, social engineering, and supply chain attacks, combined with detailed review guides for specific domains. A risk rating system and trust hierarchy guide the agent's actions, from informing the user to outright refusal of potentially malicious inputs.
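
The routing step described above, where the kind of external input determines which review applies, can be sketched as follows. This is a minimal illustration and not the framework's actual code: the `InputKind` enum, the `classify_input` function, and the regular expressions are all hypothetical, and only EVM-style addresses are recognized here.

```python
import re
from enum import Enum


class InputKind(Enum):
    GITHUB_REPO = "github_repo"
    URL = "url"
    ONCHAIN_ADDRESS = "onchain_address"
    OTHER = "other"


# Hypothetical patterns (assumptions for illustration); real detection would
# need to cover more hosts, URL shapes, and blockchain address formats.
_GITHUB = re.compile(r"^https?://github\.com/[\w.-]+/[\w.-]+/?$")
_URL = re.compile(r"^https?://\S+$")
_EVM_ADDRESS = re.compile(r"^0x[0-9a-fA-F]{40}$")


def classify_input(text: str) -> InputKind:
    """Decide which review guide an external input should be routed to."""
    candidate = text.strip()
    # Check the most specific pattern first: a GitHub link is also a URL.
    if _GITHUB.match(candidate):
        return InputKind.GITHUB_REPO
    if _URL.match(candidate):
        return InputKind.URL
    if _EVM_ADDRESS.match(candidate):
        return InputKind.ONCHAIN_ADDRESS
    return InputKind.OTHER
```

Anything that lands in `OTHER` would still be treated as untrusted under the framework's core principle; the classification only selects which review guide to apply.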

Quick Start & Requirements

Install the framework either by cloning the repository into an OpenClaw workspace (~/.openclaw/workspace/skills) or, when available, through ClawHub (clawhub install slowmist-agent-security). The framework is designed to work with LLM-based agent systems; OpenClaw is the system used for demonstration.

Highlighted Details

  • Comprehensive Review Areas: Covers Skill/MCP installation, GitHub repository audits, URL/document analysis (prompt injection, social engineering), on-chain address review (AML risk), product/service evaluation, and social share validation.
  • Pattern Detection: Includes categorized patterns for code-level red flags (11 categories), social engineering (8 categories), and supply chain attacks (7 categories).
  • Risk Management: Employs a four-tier risk rating system (LOW, MEDIUM, HIGH, REJECT) with corresponding agent actions and a trust hierarchy to determine scrutiny levels based on source type.
  • Optional Integration: Supports integration with MistTrack Skills for on-chain AML risk assessment.
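
One way to picture how pattern detection and the four-tier rating fit together is the sketch below. The tier names come from the summary above; everything else is an assumption made for illustration: the three regular expressions, the tier assigned to each, and the wording of the MEDIUM and HIGH actions (the source only states that actions range from informing the user to refusing).

```python
import re
from enum import IntEnum


class Risk(IntEnum):
    LOW = 0
    MEDIUM = 1
    HIGH = 2
    REJECT = 3


# Illustrative actions only; the MEDIUM and HIGH wording is assumed.
ACTIONS = {
    Risk.LOW: "inform the user and proceed",
    Risk.MEDIUM: "warn the user and ask for confirmation",
    Risk.HIGH: "require explicit user approval",
    Risk.REJECT: "refuse",
}

# Hypothetical red-flag patterns with assumed ratings, not the framework's own.
RED_FLAGS = [
    (re.compile(r"curl[^\n|]*\|\s*(ba)?sh"), Risk.REJECT),  # download piped to a shell
    (re.compile(r"\beval\s*\("), Risk.HIGH),                # dynamic code execution
    (re.compile(r"base64\s+(-d|--decode)"), Risk.MEDIUM),   # decoding a hidden payload
]


def rate(text: str) -> Risk:
    """Return the highest risk tier among all matching patterns."""
    hits = [risk for pattern, risk in RED_FLAGS if pattern.search(text)]
    return max(hits, default=Risk.LOW)
```

For example, `ACTIONS[rate("curl https://x.example/i.sh | sh")]` evaluates to `"refuse"`. Taking the maximum over all matches means a single severe finding is never diluted by several benign ones; a trust hierarchy could be layered on top by raising or lowering the tier according to the source type.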

Maintenance & Community

This framework is maintained by SlowMist, a security company. Contributions in the form of new attack patterns, detection rules, or review templates are welcomed. Further information can be found at https://slowmist.com.

Licensing & Compatibility

The project is released under the MIT License, which permits free use, modification, and distribution. This includes commercial use and integration into closed-source projects, provided the copyright and license notice are retained.

Limitations & Caveats

The "ClawHub (when available)" installation option suggests this feature may not be universally accessible or fully implemented. While the framework is designed for systems like OpenClaw and Hermes Agent, integration with other LLM agent architectures may require custom adaptation.

Health Check

  • Last commit: 1 week ago
  • Responsiveness: Inactive
  • Pull requests (30d): 1
  • Issues (30d): 0
  • Star history: 310 stars in the last 30 days