AIOpsLab by microsoft

AIOps agent framework for design, development, and evaluation

Created 11 months ago
685 stars

Top 49.6% on SourcePulse

Project Summary

AIOpsLab is a comprehensive framework for designing, developing, and evaluating autonomous AIOps agents. It targets researchers and engineers building AI-driven solutions for cloud operations, offering a standardized and reproducible environment for agent testing and benchmarking. The framework simplifies the deployment of complex microservice environments, fault injection, workload generation, and telemetry data collection.

How It Works

AIOpsLab orchestrates microservice cloud environments so that real-world operational scenarios can be simulated. It deploys applications via Helm charts and manages Kubernetes clusters, either local (via kind) or remote. Agents interact with these environments through a defined interface: they receive state information and return actions. Custom problems are created by defining an application, a task (detection, localization, analysis, or mitigation), a fault, a workload, and evaluation metrics, which keeps new problems extensible and comparable with existing ones.
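To make the pieces of a problem definition concrete, here is a minimal sketch of how the five components could be grouped. `TaskType` and `ProblemSpec` are hypothetical names invented for this illustration, and all field values are placeholders; none of this is AIOpsLab's actual API.

```python
from dataclasses import dataclass, field
from enum import Enum


class TaskType(Enum):
    """The four task kinds named in the summary (hypothetical enum)."""
    DETECTION = "detection"
    LOCALIZATION = "localization"
    ANALYSIS = "analysis"
    MITIGATION = "mitigation"


@dataclass
class ProblemSpec:
    """Hypothetical container for one problem; not an AIOpsLab class."""
    application: str   # placeholder label for a Helm-deployed application
    task: TaskType     # which of the four task kinds is being evaluated
    fault: str         # placeholder label for the injected fault
    workload: str      # placeholder label for the generated workload
    metrics: list[str] = field(default_factory=list)  # evaluation metric names

    def describe(self) -> str:
        return (
            f"{self.task.value} on {self.application} "
            f"(fault={self.fault}, workload={self.workload})"
        )


spec = ProblemSpec(
    application="example-app",
    task=TaskType.LOCALIZATION,
    fault="example-fault",
    workload="example-workload",
    metrics=["example-metric"],
)
print(spec.describe())
```

Keeping the task as an enum rather than a free-form string is one way to ensure every problem maps to exactly one of the four task kinds.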

Quick Start & Requirements

  • Installation: Recommended via Poetry (poetry install). Requires Python >= 3.11.
  • Cluster Setup:
    • Local: Use kind with provided YAML configurations (kind create cluster --config kind/kind-config-x86.yaml).
    • Remote: Supports any Kubernetes cluster configured via kubectl.
  • Configuration: Update config.yml with cluster host and user details.
  • Agent Execution:
    • Human agent: python3 cli.py
    • GPT-4 baseline: export OPENAI_API_KEY=<YOUR_OPENAI_API_KEY>; python3 clients/gpt.py
  • Documentation: Overview, Quick Start, Installation, Usage.
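
Before following the steps above, it can help to confirm that the prerequisites are present. The sketch below checks the Python version and looks for the command-line tools mentioned in this summary on the PATH. The `preflight` helper is hypothetical and not part of AIOpsLab; the assumption that the `helm` CLI must be installed locally is inferred from the summary, and `kind` is only needed for local clusters.

```python
import shutil
import sys

# Tools named in this summary; whether each is strictly required depends on
# the chosen setup (kind only matters for a local cluster).
REQUIRED_TOOLS = ("poetry", "kind", "kubectl", "helm")
MIN_PYTHON = (3, 11)


def preflight(tools=REQUIRED_TOOLS, min_python=MIN_PYTHON) -> dict[str, bool]:
    """Return a mapping of requirement name to whether it is satisfied."""
    label = f"python>={min_python[0]}.{min_python[1]}"
    report = {label: sys.version_info[:2] >= min_python}
    for tool in tools:
        # shutil.which returns None when the executable is not on the PATH.
        report[tool] = shutil.which(tool) is not None
    return report


if __name__ == "__main__":
    for name, ok in preflight().items():
        print(f"{'ok' if ok else 'missing':8}{name}")
```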

Highlighted Details

  • Supports both local simulated clusters (using kind) and remote Kubernetes clusters.
  • Provides a structured approach to define AIOps problems, including applications, tasks, faults, workloads, and evaluators.
  • Makes custom agents easy to onboard: an agent only needs to be a Python class with an async def get_action(self, state: str) -> str method.
  • Includes baseline agents (e.g., GPT-4) for comparative evaluation.
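
As an illustration of the agent interface described above, here is a minimal sketch of a class exposing the `get_action` coroutine. Only the method signature comes from the summary. `EchoAgent`, the `drive` loop, and the example state strings are hypothetical stand-ins, and `drive` does not represent how the framework itself calls an agent.

```python
import asyncio


class EchoAgent:
    """Toy agent with the get_action signature from the summary."""

    def __init__(self) -> None:
        self.history: list[str] = []

    async def get_action(self, state: str) -> str:
        # Record the observed state; a real agent would query an LLM
        # or a policy here and return a meaningful action.
        self.history.append(state)
        return f"observed step {len(self.history)}"


async def drive(agent, states):
    """Hypothetical stand-in loop: feed states in, collect actions out."""
    actions = []
    for state in states:
        actions.append(await agent.get_action(state))
    return actions


actions = asyncio.run(
    drive(EchoAgent(), ["example state one", "example state two"])
)
print(actions)
```

Because `get_action` is a coroutine, an agent can await network calls (for example to a model API) without blocking whatever loop drives it.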

Maintenance & Community

The project is developed by Microsoft. Key contributors are listed in the papers the project cites. The project adheres to the Microsoft Open Source Code of Conduct.

Licensing & Compatibility

Licensed under the MIT license. This license permits commercial use and linking with closed-source projects.

Limitations & Caveats

The framework relies heavily on Kubernetes and Helm for deployment, requiring familiarity with these technologies. While it supports local simulation via kind, performance and behavior may differ from actual cloud environments. The setup for remote clusters and specific fault/workload injections might require significant configuration effort.

Health Check

  • Last commit: 1 week ago
  • Responsiveness: 1 week
  • Pull requests (30d): 14
  • Issues (30d): 2
  • Star history: 21 stars in the last 30 days
