agent-threat-rules by Agent-Threat-Rule

Open standard for AI agent threat detection

Created 3 months ago

281 stars

Top 92.6% on SourcePulse

Project Summary

Summary

ATR (Agent Threat Rules) addresses the emerging security threat landscape of AI agents, vulnerable to attacks like prompt injection and tool poisoning. It provides an open, vendor-neutral, machine-readable rule format, analogous to Sigma for SIEMs, enabling standardized detection of AI agent threats. This benefits engineers and security professionals by offering a peer-reviewable standard to identify and mitigate risks in AI agent infrastructure.

How It Works

ATR rules are YAML documents conforming to a versioned schema (ATR-YYYY-NNNNN), specifying attack patterns, target input fields (LLM I/O, tool arguments), testing, and mappings to taxonomies like OWASP Agentic and MITRE ATLAS. The project includes a reference TypeScript engine and a Python wrapper (pyatr), designed with a narrow schema for broad engine compatibility across languages (Go, Rust). This offers a standardized, unambiguous method for AI agent threat detection.

Quick Start & Requirements

Installation: Node.js/TypeScript: npm install agent-threat-rules; Python: pip install pyatr; GitHub Action: uses: Agent-Threat-Rule/agent-threat-rules@v1.
Prerequisites: Node.js and npm, or Python.
Links: npm: https://www.npmjs.com/package/agent-threat-rules; PyPI: https://pypi.org/project/pyatr; GitHub Action: https://github.com/marketplace/actions/atr-scan; Specification: SPEC.md.

Highlighted Details

Production Adoption: Integrated into Microsoft AGT, Cisco AI Defense, MISP, and Gen Digital Sage.
Broad Coverage: Maps to 10/10 OWASP Agentic Top 10 categories and achieves 91.8% SAFE-MCP coverage.
High Efficacy: Demonstrates strong performance, including 98.0% recall on NVIDIA garak "in-the-wild" jailbreaks and 99.1% recall on Anthropic's hh-rlhf dataset.
Standardization Efforts: Developing proposal-stage scaffolding for potential OASIS Open Project submission.

Maintenance & Community

The project is transitioning from a single-maintainer (BDFL, Adam Lin) to a Technical Steering Committee (TSC). Maintenance is supported by community sponsorship via Open Collective. Contributions are welcomed via a streamlined issue-to-proposal process.

Licensing & Compatibility

Licensed under the permissive MIT License, allowing commercial use and integration into closed-source projects without copyleft restrictions.

Limitations & Caveats

The ATR rule format is a Working Draft (v3.0.0-alpha.1), and standardization is proposed. The regex-based detection misses paraphrased attacks and novel patterns, evidenced by zero recall on academic paraphrase corpora. ATR is a content layer; pair with sandboxing and human oversight for high-risk actions. The project currently has a single primary maintainer, with plans to increase this to mitigate bus factor.

Health Check

Last Commit

1 day ago

Responsiveness

Inactive

Pull Requests (30d)

145

Issues (30d)

Star History

55 stars in the last 30 days