holmesgpt  by robusta-dev

AI agent for alert root cause analysis

created 1 year ago
1,140 stars

Top 34.4% on sourcepulse

Project Summary

HolmesGPT is an AI-powered on-call agent designed to accelerate alert resolution by automatically correlating and investigating issues across observability data and organizational knowledge. It targets SREs, DevOps engineers, and operations teams seeking to reduce Mean Time To Resolution (MTTR) by automating root cause analysis.

How It Works

HolmesGPT runs an agentic loop that connects large language models to live observability data and internal documentation. It integrates with data sources such as Kubernetes, Grafana, Helm, and AWS RDS to fetch logs, traces, and metrics, which lets it determine whether an issue stems from the application or the infrastructure and trace it to an upstream root cause.
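To make the idea of an agentic loop concrete, here is a minimal sketch in Python. It is not HolmesGPT's implementation: the tool names, the `stub_model` function (a stand-in for a real LLM call), the `checkout-api` target, and the canned tool outputs are all hypothetical. It only illustrates the general pattern of a model repeatedly choosing a read-only data fetch until it has enough evidence to answer.

```python
from typing import Callable, Dict, List, Optional, Tuple

Tool = Callable[[str], str]


# Hypothetical read-only tools; a real agent would query live systems here.
def fetch_pod_logs(target: str) -> str:
    return f"{target}: container restarted, last state OOMKilled"


def fetch_metrics(target: str) -> str:
    return f"{target}: memory usage at 98% of limit"


TOOLS: Dict[str, Tool] = {"pod_logs": fetch_pod_logs, "metrics": fetch_metrics}


def stub_model(question: str, evidence: List[str]) -> Tuple[Optional[str], str]:
    """Stand-in for an LLM call (assumption, not a real API).

    Returns (tool_name, argument) to request more data,
    or (None, answer) once it is ready to conclude.
    """
    if len(evidence) == 0:
        return "pod_logs", "checkout-api"
    if len(evidence) == 1:
        return "metrics", "checkout-api"
    return None, "Likely root cause: memory limit too low. Evidence: " + "; ".join(evidence)


def investigate(
    question: str,
    model: Callable[[str, List[str]], Tuple[Optional[str], str]] = stub_model,
    tools: Dict[str, Tool] = TOOLS,
    max_steps: int = 5,
) -> str:
    """Loop: ask the model what to do next, run the tool, feed the result back."""
    evidence: List[str] = []
    for _ in range(max_steps):
        tool_name, payload = model(question, evidence)
        if tool_name is None:
            return payload  # the model produced a final answer
        if tool_name not in tools:
            evidence.append(f"unknown tool requested: {tool_name}")
            continue
        evidence.append(tools[tool_name](payload))
    return "No conclusion within step budget. Evidence: " + "; ".join(evidence)


if __name__ == "__main__":
    print(investigate("why is checkout-api restarting?"))
```

The `max_steps` budget is a design choice in this sketch: it bounds how many data fetches a single investigation can make, so a model that keeps asking for more data still terminates.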

Quick Start & Requirements

  • Installation: Robusta SaaS (Kubernetes required) or Desktop CLI/K9s plugin.
  • Prerequisites: an LLM API key; a Kubernetes cluster for the SaaS installation.
  • Usage:
    • SaaS: platform.robusta.dev
    • CLI: holmes ask "what pods are unhealthy and why?"
    • Alertmanager: holmes investigate alertmanager --alertmanager-url <URL>
  • Docs: How it Works, Quick Start, YouTube Demo
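The two CLI invocations listed above can also be driven from a script. The sketch below only builds the argument lists for the commands shown in this summary; the helper names are hypothetical, and `http://localhost:9093` is a placeholder URL, not a value from the project. Actually running the commands assumes the `holmes` CLI is installed and an LLM API key is configured.

```python
import shlex
from typing import List


def ask_command(question: str) -> List[str]:
    """Argument list for: holmes ask "<question>"."""
    return ["holmes", "ask", question]


def investigate_alertmanager_command(alertmanager_url: str) -> List[str]:
    """Argument list for: holmes investigate alertmanager --alertmanager-url <URL>."""
    return ["holmes", "investigate", "alertmanager", "--alertmanager-url", alertmanager_url]


if __name__ == "__main__":
    # Print shell-safe versions of the commands instead of executing them.
    print(shlex.join(ask_command("what pods are unhealthy and why?")))
    # Placeholder URL (assumption); substitute your own Alertmanager address.
    print(shlex.join(investigate_alertmanager_command("http://localhost:9093")))
```

Passing the list to `subprocess.run(cmd, check=True)` would execute it without going through a shell, which avoids quoting problems when the question contains spaces or punctuation.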

Highlighted Details

  • Supports 10+ data sources including Kubernetes, Grafana, Helm, ArgoCD, AWS RDS, Prometheus, Confluence, and GitHub.
  • Offers bi-directional integrations with Slack, Prometheus/Alertmanager, PagerDuty, OpsGenie, and Jira.
  • For production safety, uses read-only access, respects RBAC, and does not train on user data.
  • Allows customization of data sources and runbooks for specific alert scenarios.

Licensing & Compatibility

  • MIT License. Permissive for commercial use and closed-source linking.

Limitations & Caveats

  • Several integrations (OpenSearch, NewRelic, Coralogix, GitHub, Slack) are in Beta status.
  • Robusta SaaS installation requires Kubernetes.

Health Check

  • Last commit: 2 days ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 166
  • Issues (30d): 36

Star History

  • 297 stars in the last 90 days
