Semia by berabuddies

Auditing AI agent skills for security vulnerabilities

Created 1 month ago
377 stars

Top 75.1% on SourcePulse

Summary

Semia addresses the security risks of AI agent skills by providing an automated, evidence-backed audit. It analyzes skills (markdown files with embedded code) without executing them, detailing every potential action, effect, and sensitive data access. Developers and users can then decide whether to trust a skill based on what it can actually do, not on a README review alone.

How It Works

Semia treats agent skills as static data and runs them through a four-stage pipeline: prepare, synthesize (via LLM), detect, and report. Only the synthesize stage calls an LLM; prepare, detect, and report are deterministic. The pipeline extracts facts about a skill's behavior, identifies potential risks, and grounds each finding in specific source lines, so a reviewer can verify every finding against the skill itself instead of relying on manual inspection alone.
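The four stages can be illustrated with a toy sketch. This is not Semia's implementation: every name here (Fact, PATTERNS, the possible-exfiltration rule) is hypothetical, and the synthesize stage is replaced by fixed regular expressions so the example runs offline without an LLM. It only shows the shape of the idea: number the lines, extract facts tied to line numbers, apply a deterministic rule, and report findings with their evidence.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Fact:
    kind: str  # hypothetical fact category, e.g. "network" or "env_read"
    line: int  # 1-based line number in the skill source
    text: str  # the source line the fact was extracted from


def prepare(source: str) -> list[tuple[int, str]]:
    """Number the skill's lines; nothing is executed."""
    return list(enumerate(source.splitlines(), start=1))


# Assumption: regexes stand in for the LLM-backed synthesize stage.
PATTERNS = {
    "network": re.compile(r"\b(curl|wget)\b"),
    "env_read": re.compile(r"\$[A-Z_]*(KEY|TOKEN|SECRET)"),
}


def synthesize(lines: list[tuple[int, str]]) -> list[Fact]:
    """Extract facts, each tied to the line it came from."""
    facts = []
    for number, text in lines:
        for kind, pattern in PATTERNS.items():
            if pattern.search(text):
                facts.append(Fact(kind, number, text.strip()))
    return facts


def detect(facts: list[Fact]) -> list[dict]:
    """Toy rule: reading a secret and using the network may mean exfiltration."""
    secrets = [f for f in facts if f.kind == "env_read"]
    network = [f for f in facts if f.kind == "network"]
    if secrets and network:
        evidence = sorted({f.line for f in secrets + network})
        return [{"rule": "possible-exfiltration", "evidence": evidence}]
    return []


def report(findings: list[dict]) -> str:
    """Render findings with the source lines that support them."""
    if not findings:
        return "no findings"
    return "\n".join(
        f"{f['rule']}: lines {', '.join(map(str, f['evidence']))}" for f in findings
    )


SKILL = "\n".join([
    "# Weather skill",
    "Fetch the forecast for a city.",
    "    export TOKEN=$API_TOKEN",
    "    curl https://example.com/forecast",
])

if __name__ == "__main__":
    print(report(detect(synthesize(prepare(SKILL)))))
```

Because detect and report depend only on the extracted facts, rerunning them on the same facts gives the same report; any variability is confined to the synthesize step.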

Quick Start & Requirements

  • Installation: pip install semia-audit
  • LLM Provider: Required for synthesis unless using host plugins (Codex, Claude Code, OpenClaw). Supports OpenAI (default), Anthropic, DeepSeek, vLLM, and local CLIs. Configure via environment variables (e.g., OPENAI_API_KEY).
  • Integration: Plugins available for Codex, Claude Code, and OpenClaw CLIs.
  • Output: Generates report.md, with options for SARIF 2.1.0 (--format sarif) for GitHub Code Scanning or structured JSON (--format json).
  • Repair: Offers an automated semia repair command.
  • Docs: Project background and technique are detailed in arXiv:2605.00314.
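For teams consuming the SARIF output outside GitHub Code Scanning, a short script can summarize a log and decide whether to fail a build. The sketch below relies only on the standard SARIF 2.1.0 layout (runs, results, locations, physicalLocation); the rule ID, file name, and message in SAMPLE are made up for illustration and are not taken from Semia's actual output. In practice the dictionary would come from json.load on whatever file the audit wrote.

```python
def summarize_sarif(sarif: dict) -> list[tuple[str, str, str, int]]:
    """Return (level, ruleId, uri, startLine) for every located result."""
    rows = []
    for run in sarif.get("runs", []):
        for result in run.get("results", []):
            # Simplification: fall back to "warning" when no level is given,
            # ignoring any per-rule default configuration.
            level = result.get("level", "warning")
            rule_id = result.get("ruleId", "?")
            for location in result.get("locations", []):
                physical = location.get("physicalLocation", {})
                uri = physical.get("artifactLocation", {}).get("uri", "?")
                line = physical.get("region", {}).get("startLine", 0)
                rows.append((level, rule_id, uri, line))
    return rows


def should_fail(rows, fail_on=("error",)) -> bool:
    """True if any result has a level listed in fail_on."""
    return any(level in fail_on for level, *_ in rows)


# Hypothetical SARIF log; rule ID, path, and message are illustrative only.
SAMPLE = {
    "version": "2.1.0",
    "runs": [
        {
            "tool": {"driver": {"name": "example-auditor"}},
            "results": [
                {
                    "ruleId": "demo-rule",
                    "level": "error",
                    "message": {"text": "Skill reads a secret and calls the network."},
                    "locations": [
                        {
                            "physicalLocation": {
                                "artifactLocation": {"uri": "SKILL.md"},
                                "region": {"startLine": 12},
                            }
                        }
                    ],
                }
            ],
        }
    ],
}

if __name__ == "__main__":
    rows = summarize_sarif(SAMPLE)
    for level, rule_id, uri, line in rows:
        print(f"{level}: {rule_id} at {uri}:{line}")
    raise SystemExit(1 if should_fail(rows) else 0)
```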

Highlighted Details

  • Evidence-Grounded Audits: All findings are precisely linked to specific lines of source code within the skill.
  • GitHub Code Scanning Integration: Native support for SARIF 2.1.0 output facilitates direct integration into CI/CD workflows.
  • Automated Remediation: The semia repair command leverages LLMs to suggest patches or security constraints.
  • Deterministic Core: The prepare, detect, and report stages are deterministic, ensuring consistent results independent of LLM variability.
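One way to picture how evidence grounding limits LLM variability: a deterministic check can reject any finding whose cited lines do not exist or do not contain the quoted text. The sketch below is an assumption about how such a check could look, not a description of Semia's internals; the Finding type and is_grounded function are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    rule: str        # hypothetical rule identifier
    start_line: int  # 1-based, inclusive
    end_line: int    # 1-based, inclusive
    quote: str       # text the finding claims appears in the cited lines


def is_grounded(finding: Finding, source: str) -> bool:
    """Check that the cited range exists and contains the quoted text."""
    lines = source.splitlines()
    if not (1 <= finding.start_line <= finding.end_line <= len(lines)):
        return False
    cited = "\n".join(lines[finding.start_line - 1 : finding.end_line])
    return finding.quote.strip() in cited


SOURCE = "\n".join([
    "# Backup skill",
    "Upload the archive.",
    "    curl -T backup.tar https://example.com/upload",
])

good = Finding("network-upload", 3, 3, "curl -T backup.tar")
bad_range = Finding("network-upload", 7, 7, "curl -T backup.tar")
bad_quote = Finding("network-upload", 2, 2, "curl -T backup.tar")

if __name__ == "__main__":
    for finding in (good, bad_range, bad_quote):
        print(finding.start_line, is_grounded(finding, SOURCE))
```

A check like this runs the same way every time, so an ungrounded claim produced upstream can be dropped before it reaches the report.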

Maintenance & Community

Contributions are welcomed via CONTRIBUTING.md. No specific community channels (e.g., Discord, Slack) or notable sponsorships are detailed in the README.

Licensing & Compatibility

Released under the Apache License 2.0. This license generally permits commercial use and integration with closed-source projects.

Limitations & Caveats

The synthesize stage requires an external LLM provider to be configured, which adds a dependency unless Semia runs through a host agent such as Codex, Claude Code, or OpenClaw. The core technique is described in a recent arXiv paper, which suggests the approach is still maturing.

Health Check

  • Last Commit: 1 day ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 8
  • Issues (30d): 9
  • Star History: 380 stars in the last 30 days
