Semia by berabuddies

Auditing AI agent skills for security vulnerabilities

Created 2 months ago

555 stars

Top 56.9% on SourcePulse

Project Summary

Summary

Semia addresses the security risks of AI agent skills by providing an automated, evidence-backed audit. It analyzes skills (markdown files with embedded code) without execution, detailing every potential action, effect, and sensitive data access. This empowers developers and users to trust skills by understanding their precise capabilities, moving beyond superficial README reviews.

How It Works

Semia treats agent skills as static data, employing a deterministic pipeline: prepare, synthesize (via LLM), detect, and report. It maps a skill's behavior by extracting facts and identifying potential risks, grounding each finding in specific source lines. This approach ensures a verifiable and reproducible security assessment, offering a robust alternative to manual inspection.

Quick Start & Requirements

Installation: pip install semia-audit
LLM Provider: Required for synthesis unless using host plugins (Codex, Claude Code, OpenClaw). Supports OpenAI (default), Anthropic, DeepSeek, vLLM, and local CLIs. Configure via environment variables (e.g., OPENAI_API_KEY).
Integration: Plugins available for Codex, Claude Code, and OpenClaw CLIs.
Output: Generates report.md, with options for SARIF 2.1.0 (--format sarif) for GitHub Code Scanning or structured JSON (--format json).
Repair: Offers an automated semia repair command.
Docs: Project background and technique detailed in arXiv:2605.00314.

Highlighted Details

Evidence-Grounded Audits: All findings are precisely linked to specific lines of source code within the skill.
GitHub Code Scanning Integration: Native support for SARIF 2.1.0 output facilitates direct integration into CI/CD workflows.
Automated Remediation: The semia repair command leverages LLMs to suggest patches or security constraints.
Deterministic Core: The prepare, detect, and report stages are deterministic, ensuring consistent results independent of LLM variability.

Maintenance & Community

Contributions are welcomed via CONTRIBUTING.md. No specific community channels (e.g., Discord, Slack) or notable sponsorships are detailed in the README.

Licensing & Compatibility

Released under the Apache License 2.0. This license generally permits commercial use and integration with closed-source projects.

Limitations & Caveats

The 'synthesize' stage of the audit requires configuration of an external LLM provider, adding a dependency unless integrated with host agents like Codex or Claude Code. The project's core technique is detailed in a recent arXiv paper, suggesting it is a developing area.

Health Check

Last Commit

6 days ago

Responsiveness

Inactive

Pull Requests (30d)

Issues (30d)

Star History

5 stars in the last 30 days