autoagent  by kevinrgu

Autonomous agent harness engineering framework

Created 2 days ago

2,463 stars

Top 18.3% on SourcePulse

Project Summary

AutoAgent automates the complex, time-consuming work of engineering AI agent harnesses by running an autonomous, iterative development loop. It targets engineers and researchers who want to improve agent performance without editing harness code by hand. The core benefit is that AI agents can build, test, and refine their own harnesses overnight, guided by benchmark scores.

How It Works

AutoAgent uses a meta-agent approach: human engineers describe the desired agent behavior and the engineering loop in a program.md file, and the meta-agent then autonomously modifies the primary harness file, agent.py, which contains the agent's configuration, tools, and orchestration logic. The system iteratively runs the benchmark tasks defined in the tasks/ directory, evaluates the resulting score, and either keeps or discards the modifications to agent.py, hill-climbing toward a higher benchmark score. This design shifts the engineer's work from editing harness code directly to programming the meta-agent's instructions.
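
The keep-or-discard loop described above can be sketched as a simple hill climb. This is a minimal illustration, not AutoAgent's actual implementation: propose_edit and run_benchmark are hypothetical stand-ins for the meta-agent's edit step and the benchmark run over tasks/, and the harness is modeled as a plain string rather than a real agent.py file.

```python
from typing import Callable


def hill_climb(
    harness: str,
    propose_edit: Callable[[str], str],     # hypothetical: meta-agent rewrites the harness
    run_benchmark: Callable[[str], float],  # hypothetical: scores a harness on the tasks
    iterations: int,
) -> tuple[str, float]:
    """Keep a candidate harness only when it scores strictly higher."""
    best_score = run_benchmark(harness)
    for _ in range(iterations):
        candidate = propose_edit(harness)
        score = run_benchmark(candidate)
        if score > best_score:
            # Improvement: the candidate becomes the new baseline.
            harness, best_score = candidate, score
        # Otherwise the candidate is discarded and the baseline is unchanged.
    return harness, best_score
```

Because a change survives only when the score rises, the baseline never regresses, but the loop can stall at a local optimum; how the real system handles that is not stated in the source.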

Quick Start & Requirements

  • Install and run: uv handles dependency management and Docker provides environment isolation. Key commands are uv sync, docker build -f Dockerfile.base -t autoagent-base ., and uv run harbor run ....
  • Non-default prerequisites: Docker, Python 3.10+, uv, and model-provider credentials (e.g., OPENAI_API_KEY).
  • Links: The README points to the Harbor documentation for task format details.

Highlighted Details

  • Autonomous agent harness development via a meta-agent that modifies agent.py.
  • Single-file, registry-driven harness design (agent.py) for simplicity and maintainability.
  • Score-driven iteration loop, where changes are kept only if they improve benchmark performance.
  • Harbor-compatible task format allows for consistent evaluation across different datasets.
  • Docker isolation ensures agent execution does not affect the host system.
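
The source does not show what agent.py contains, so the following is only a sketch of what a single-file, registry-driven design can look like in general. Every name here (TOOLS, register_tool, call_tool, and the two example tools) is hypothetical and is not taken from the project.

```python
from typing import Callable, Dict

# Hypothetical registry: maps a tool name to the function that implements it.
TOOLS: Dict[str, Callable[..., str]] = {}


def register_tool(name: str) -> Callable[[Callable[..., str]], Callable[..., str]]:
    """Decorator that adds a function to the registry under the given name."""
    def decorator(fn: Callable[..., str]) -> Callable[..., str]:
        TOOLS[name] = fn
        return fn
    return decorator


@register_tool("echo")
def echo(text: str) -> str:
    return text


@register_tool("word_count")
def word_count(text: str) -> str:
    return str(len(text.split()))


def call_tool(name: str, **kwargs: str) -> str:
    """Look up a tool by name and invoke it with keyword arguments."""
    if name not in TOOLS:
        raise KeyError(f"unknown tool: {name}")
    return TOOLS[name](**kwargs)
```

In a layout like this, adding or removing a tool is a local edit to one file, which is the kind of change an automated editor can make and revert cheaply.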

Maintenance & Community

The project is actively seeking engineers and provides a contact address for inquiries (hello@thirdlayer.inc). The README does not list community channels or contributor information.

Licensing & Compatibility

  • License type: MIT.
  • Compatibility notes: The MIT license is highly permissive. It allows commercial use and integration into closed-source projects, provided the copyright and license notice are retained.

Limitations & Caveats

Users must manually define and add evaluation tasks to the tasks/ directory. The project appears to be in active development, with a product launch anticipated soon. Regular maintenance is required to clean up accumulating Docker images and containers.
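
The Docker cleanup mentioned above could be scripted along these lines. This is a sketch that assumes the standard docker CLI is on the PATH; the cleanup_docker helper is hypothetical and not part of the project. It defaults to a dry run that only reports the commands it would execute.

```python
import subprocess

# Standard docker CLI prune commands.
PRUNE_COMMANDS = [
    ["docker", "container", "prune", "--force"],  # removes stopped containers
    ["docker", "image", "prune", "--force"],      # removes dangling images only
]


def cleanup_docker(dry_run: bool = True) -> list[list[str]]:
    """Run the prune commands, or just return them when dry_run is True."""
    for cmd in PRUNE_COMMANDS:
        if not dry_run:
            subprocess.run(cmd, check=True)
    return [list(cmd) for cmd in PRUNE_COMMANDS]
```

Calling cleanup_docker(dry_run=False) executes the commands. Tagged images such as autoagent-base are left in place, since docker image prune without additional flags removes only dangling images.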

Health Check

  • Last commit: 1 day ago
  • Responsiveness: Inactive
  • Pull requests (30d): 4
  • Issues (30d): 1
  • Star history: 2,532 stars in the last 2 days
