autoharness by kayba-ai

Autonomous optimization for agent harnesses

Created 3 weeks ago

266 stars

Top 96.2% on SourcePulse

Project Summary

This project provides an autonomous control plane for optimizing agent harnesses: it automatically proposes and evaluates changes to prompts, configuration, middleware, and source code. It targets engineers and researchers who want more reliable, better-performing production agents by iteratively refining the underlying harness against automated evaluations. The main benefits are less manual optimization effort and a more robust agent system.

How It Works

Autoharness inspects a target harness repository and defines an optimization campaign. The guide command sets up an autoharness.yaml configuration, which specifies the benchmark command used to evaluate candidate changes. The system supports several adapters for running benchmarks (e.g., pytest, harbor) and multiple proposal generators (e.g., openai_responses, codex_cli, claude_code) for producing candidate improvements. Iterative optimize runs generate, evaluate, and promote candidate changes, persisting state and champions in the .autoharness/ directory, which makes the optimization loop automated and resumable.
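
To make the generate, evaluate, promote cycle concrete, here is a minimal Python sketch of a resumable loop. It is not autoharness's implementation: the function names (optimize, toy_propose, toy_evaluate), the state.json file name, and the state layout are all assumptions made for illustration, and the "harness" is reduced to a single hypothetical temperature setting.

```python
import json
import random
import tempfile
from pathlib import Path
from typing import Callable


def optimize(
    state_dir: Path,
    propose: Callable[[dict, random.Random], dict],
    evaluate: Callable[[dict], float],
    baseline: dict,
    iterations: int,
    seed: int = 0,
) -> dict:
    """Run a resumable generate/evaluate/promote loop and return the final state."""
    state_dir.mkdir(parents=True, exist_ok=True)
    state_file = state_dir / "state.json"  # hypothetical file name

    # Resume from persisted state if present; otherwise score the baseline.
    if state_file.exists():
        state = json.loads(state_file.read_text())
    else:
        state = {"iteration": 0, "champion": baseline, "score": evaluate(baseline)}

    while state["iteration"] < iterations:
        # Seed per iteration so a resumed run proposes the same candidates.
        rng = random.Random(seed + state["iteration"])
        candidate = propose(state["champion"], rng)
        score = evaluate(candidate)
        # Promote only when the candidate beats the current champion.
        if score > state["score"]:
            state["champion"], state["score"] = candidate, score
        state["iteration"] += 1
        state_file.write_text(json.dumps(state))  # persist after every step
    return state


def toy_propose(champion: dict, rng: random.Random) -> dict:
    """Hypothetical generator: nudge one numeric setting at random."""
    return {"temperature": champion["temperature"] + rng.uniform(-0.2, 0.2)}


def toy_evaluate(config: dict) -> float:
    """Hypothetical benchmark: higher is better, with the best value at 0.3."""
    return -abs(config["temperature"] - 0.3)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        run_dir = Path(tmp) / "campaign"
        optimize(run_dir, toy_propose, toy_evaluate, {"temperature": 1.0}, 5)
        # Calling again with a higher budget continues from iteration 5.
        final = optimize(run_dir, toy_propose, toy_evaluate, {"temperature": 1.0}, 10)
        print(final)
```

Writing the state after every iteration is what makes the loop resumable in this sketch: an interrupted run loses at most one candidate evaluation.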

Quick Start & Requirements

  • Primary install: pipx install "git+https://github.com/kayba-ai/autoharness.git"
  • Setup: Navigate to your harness repository and run autoharness guide. For AI-assisted setup, use --assistant codex --print-next-prompt or --assistant claude --print-next-prompt.
  • Prerequisites: Python 3.x. May require API keys for AI models (e.g., OpenAI) if using model-backed generators. The harness being optimized will have its own dependencies.
  • Links: Repository: https://github.com/kayba-ai/autoharness.git

Highlighted Details

  • Autonomous optimization of agent harnesses via prompt, config, middleware, and source code changes.
  • Supports extensible adapters (e.g., pytest, harbor) and proposal generators (e.g., openai_responses, codex_cli, claude_code).
  • Integrates with AI assistants like Codex and Claude for guided setup and prompt generation.
  • Extensible via Python plugins for custom generators, preflight checks, and search strategies.
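
The summary does not describe the plugin interface, so the following is only a generic registry pattern of the kind Python plugin systems often use, not autoharness's actual API. Every name here (register_generator, get_generator, append_hint) is hypothetical.

```python
from typing import Callable, Dict

# A "generator" here is any function that maps a prompt to a revised prompt.
Generator = Callable[[str], str]

_GENERATORS: Dict[str, Generator] = {}


def register_generator(name: str) -> Callable[[Generator], Generator]:
    """Decorator that records a generator under a unique name."""
    def decorator(fn: Generator) -> Generator:
        if name in _GENERATORS:
            raise ValueError(f"generator {name!r} is already registered")
        _GENERATORS[name] = fn
        return fn
    return decorator


def get_generator(name: str) -> Generator:
    """Look up a registered generator, listing the known names on failure."""
    try:
        return _GENERATORS[name]
    except KeyError:
        known = ", ".join(sorted(_GENERATORS)) or "none"
        raise KeyError(f"unknown generator {name!r}; registered: {known}") from None


@register_generator("append_hint")
def append_hint(prompt: str) -> str:
    """Hypothetical custom generator: append a fixed instruction."""
    return prompt + "\nThink step by step."


if __name__ == "__main__":
    print(get_generator("append_hint")("Summarize the ticket."))
```

A name-keyed registry lets a configuration file select a generator by string, which is one plausible way custom generators, preflight checks, and search strategies could be wired in.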

Maintenance & Community

The project is developed by Kayba and the open-source community. No specific community channels (like Discord/Slack), roadmap links, or notable contributor details are provided in the README.

Licensing & Compatibility

The license is not explicitly stated in the provided README. This lack of clarity presents a significant adoption blocker, particularly for commercial use or integration into closed-source projects.

Limitations & Caveats

Optimization outcomes are contingent on the specific benchmark, harness implementation, and evaluation setup; certain interventions may lead to regressions. The setup process can be complex, potentially requiring AI model API keys or specific configurations. The absence of a stated license is a critical caveat for adoption.

Health Check

  • Last commit: 2 weeks ago
  • Responsiveness: Inactive
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 266 stars in the last 23 days
