autoagent  by kevinrgu

Autonomous agent harness engineering framework

Created 1 month ago
4,443 stars

Top 11.0% on SourcePulse

Project Summary

This project addresses the complex and time-consuming process of engineering AI agent harnesses by introducing an autonomous, iterative development loop. It targets engineers and researchers seeking to optimize agent performance without direct manual code modification. The core benefit is enabling AI agents to autonomously build, test, and refine their own harnesses overnight, driven by performance metrics.

How It Works

AutoAgent takes a meta-agent approach: human engineers describe the desired agent behavior and the engineering loop in a program.md file, and a meta-agent follows those instructions to modify the primary harness file, agent.py, which contains the agent's configuration, tools, and orchestration logic. The system repeatedly runs the benchmark tasks defined in the tasks/ directory, evaluates the resulting score, and keeps a modification to agent.py only if the score improves, discarding it otherwise. The result is a hill-climbing search over harness variants. This design shifts the engineer's work from editing harness code directly to writing the meta-agent's instructions.
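
The keep-or-discard loop described above can be sketched in a few lines of Python. This is a minimal illustration and not AutoAgent's actual code: the names hill_climb, toy_propose, and toy_evaluate are hypothetical, the harness is modeled as a plain string, and the scoring function is a stand-in for running the benchmark tasks.

```python
from typing import Callable, Iterator, Tuple


def hill_climb(
    initial: str,
    propose: Callable[[str], str],
    evaluate: Callable[[str], float],
    iterations: int,
) -> Tuple[str, float, int]:
    """Keep a candidate harness only when it scores strictly higher."""
    best = initial
    best_score = evaluate(best)
    kept = 0
    for _ in range(iterations):
        candidate = propose(best)      # hypothetical stand-in for a meta-agent edit
        score = evaluate(candidate)    # hypothetical stand-in for a benchmark run
        if score > best_score:
            best, best_score = candidate, score  # keep the modification
            kept += 1
        # otherwise the candidate is discarded and `best` is unchanged
    return best, best_score, kept


def make_toy_propose() -> Callable[[str], str]:
    """Return a proposer that appends a fixed sequence of edits (toy data)."""
    edits: Iterator[str] = iter([" retry", " noise", " cache"])

    def toy_propose(harness: str) -> str:
        return harness + next(edits)

    return toy_propose


def toy_evaluate(harness: str) -> float:
    """Toy score: reward 'retry' and 'cache', penalize 'noise'."""
    return harness.count("retry") + harness.count("cache") - 2 * harness.count("noise")


def demo() -> Tuple[str, float, int]:
    return hill_climb("base", make_toy_propose(), toy_evaluate, iterations=3)


if __name__ == "__main__":
    print(demo())
```

In this toy run the "noise" edit lowers the score and is discarded, while the other two edits are kept. A loop of this kind only accepts strict improvements, so it can settle on a local optimum.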

Quick Start & Requirements

  • Primary install/run commands: uv handles dependency management and Docker provides environment isolation. The key commands are uv sync; docker build -f Dockerfile.base -t autoagent-base . (the trailing dot is the build context); and uv run harbor run ...
  • Non-default prerequisites: Docker, Python 3.10+, uv, and model-provider credentials (e.g., OPENAI_API_KEY).
  • Links: The README points to the Harbor documentation for task format details.

Highlighted Details

  • Autonomous agent harness development via a meta-agent that modifies agent.py.
  • Single-file, registry-driven harness design (agent.py) for simplicity and maintainability.
  • Score-driven iteration loop, where changes are kept only if they improve benchmark performance.
  • Harbor-compatible task format allows for consistent evaluation across different datasets.
  • Docker isolation ensures agent execution does not affect the host system.
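
One way to read "single-file, registry-driven" is a module-level registry that maps tool names to functions, so that adding or removing a tool is a small, local edit. The sketch below is an assumption about what such a design could look like, not the contents of the project's agent.py: TOOLS, register_tool, run_tool, and the two example tools are all hypothetical names.

```python
from typing import Callable, Dict

# Hypothetical registry: tool name -> callable. Not taken from agent.py.
TOOLS: Dict[str, Callable[..., str]] = {}


def register_tool(name: str) -> Callable[[Callable[..., str]], Callable[..., str]]:
    """Decorator that records a function in the registry under `name`."""

    def decorator(fn: Callable[..., str]) -> Callable[..., str]:
        if name in TOOLS:
            raise ValueError(f"tool already registered: {name}")
        TOOLS[name] = fn
        return fn

    return decorator


@register_tool("echo")
def echo(text: str) -> str:
    """Return the input unchanged (illustrative tool)."""
    return text


@register_tool("word_count")
def word_count(text: str) -> str:
    """Return the number of whitespace-separated words as a string."""
    return str(len(text.split()))


def run_tool(name: str, **kwargs: str) -> str:
    """Look up a tool by name and call it with keyword arguments."""
    if name not in TOOLS:
        raise KeyError(f"unknown tool: {name}")
    return TOOLS[name](**kwargs)


if __name__ == "__main__":
    print(sorted(TOOLS))
    print(run_tool("word_count", text="keep or discard"))
```

With a layout like this, a meta-agent could change the harness by adding, removing, or rewriting a single registered function while the dispatch code stays untouched.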

Maintenance & Community

The project is actively seeking engineers and provides a contact address for inquiries (hello@thirdlayer.inc). The README does not list specific community channels or contributor details.

Licensing & Compatibility

  • License type: MIT.
  • Compatibility notes: The MIT license is permissive. It allows commercial use and integration into closed-source projects, provided the copyright and license notice are retained.

Limitations & Caveats

Users must manually define and add evaluation tasks to the tasks/ directory. The project appears to be in active development, with a product launch anticipated soon. Regular maintenance is required to clean up accumulating Docker images and containers.

Health Check

  • Last commit: 1 month ago
  • Responsiveness: Inactive
  • Pull requests (30d): 3
  • Issues (30d): 1
  • Star history: 169 stars in the last 30 days
