autocontext by greyhaven-ai

Agent improvement system for validated, reusable execution

Created 5 months ago

1,241 stars

Top 31.1% on SourcePulse

1 Expert Loves This Project

John Resig

Author of jQuery; Chief Software Architect at Khan Academy

Project Summary

Autocontext offers a closed-loop system to enhance AI agent performance across repeated executions. It solves the problem of agents starting "cold" by implementing a feedback mechanism that captures successful strategies, updates persistent knowledge, and distills validated behaviors into cost-effective local runtimes. This enables a shift from exploratory frontier model usage to reliable, reusable, and cheaper agent execution.

How It Works

The system uses a structured multi-agent loop for proposal, analysis, coaching, and architectural refinement. Strategies are evaluated through scenario execution, staged validation, and gating, and weak changes are rolled back. Successful adaptations accumulate in persistent knowledge bases (playbooks, hints, tools, reports) that inform subsequent runs. A key feature is frontier-to-local distillation: behaviors validated with frontier models are distilled into cheaper local runtimes.
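The gate-and-rollback idea described above can be illustrated with a minimal sketch. This is not Autocontext's actual code or API: the `Knowledge` class, the `improve` function, the `min_gain` threshold, and the toy scoring function are all hypothetical names chosen for illustration. The sketch only shows the general pattern of keeping a proposed change when it beats the best validated score and discarding it otherwise.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Knowledge:
    """Hypothetical stand-in for a persistent playbook store."""
    playbook: list[str] = field(default_factory=list)
    best_score: float = 0.0


def improve(
    knowledge: Knowledge,
    proposals: list[str],
    evaluate: Callable[[list[str]], float],
    min_gain: float = 0.0,
) -> list[str]:
    """Try each proposed hint; keep it only if it passes the gate.

    A candidate playbook is built without mutating the stored one,
    so a rejected proposal is "rolled back" simply by not saving it.
    """
    accepted = []
    for hint in proposals:
        candidate = knowledge.playbook + [hint]
        score = evaluate(candidate)
        if score > knowledge.best_score + min_gain:  # the gate
            knowledge.playbook = candidate
            knowledge.best_score = score
            accepted.append(hint)
    return accepted


def toy_score(playbook: list[str]) -> float:
    """Toy evaluator (assumption): good hints add 1, bad hints subtract 1."""
    good = sum(1 for h in playbook if h.startswith("good"))
    bad = sum(1 for h in playbook if h.startswith("bad"))
    return float(good - bad)


if __name__ == "__main__":
    kb = Knowledge()
    kept = improve(kb, ["good-a", "bad-b", "good-c"], toy_score)
    print(kept, kb.best_score)
```

In a real system the evaluator would be a scenario run rather than a string check, and the store would be written to disk between runs; the sketch leaves both out.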

Quick Start & Requirements

Installation uses uv for environment and dependency management. From the autocontext directory, create and activate a virtual environment (uv venv, then source .venv/bin/activate) and sync the dev dependencies (uv sync --group dev). A local quick-start run that requires no API keys is: uv run autoctx run --scenario grid_ctf --gens 3 --run-id quickstart. Artifacts are stored under runs/ and knowledge/. Anthropic integration requires an API key, and MLX training requires macOS on Apple Silicon. Key docs: autocontext/README.md, autocontext/docs/mlx-training.md.

Highlighted Details

Persistent knowledge management: stores playbooks, hints, tools, reports, and progress snapshots across runs.
Staged validation and harness synthesis for robust execution.
Frontier-to-local distillation, optimized with MLX on Apple Silicon.
Flexible runtime routing: Anthropic, OpenAI-compatible, Ollama, vLLM, MLX, Pi.
Extensive integration surfaces: CLI, API server, dashboard, TypeScript/TUI.

Maintenance & Community

The repository was previously known as MTS. Specific details on community channels (e.g., Discord, Slack), active contributors, sponsorships, or a public roadmap are not detailed in the provided README.

Licensing & Compatibility

The specific open-source license is not explicitly stated in the provided README text, though a LICENSE file is referenced. Compatibility for commercial use or linking with closed-source projects would require clarification of the license terms.

Limitations & Caveats

MLX-based training is exclusively supported on Apple Silicon macOS. The system's effectiveness relies on robust feedback and validation loops; initial frontier model runs may incur higher costs before distillation becomes viable.
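The cost caveat above can be made concrete with a rough break-even calculation. The function name and every cost figure below are hypothetical and are not taken from the project; the sketch only assumes a one-time distillation cost and a fixed per-run cost for each runtime.

```python
import math
from typing import Optional


def break_even_runs(
    distill_cost: float,
    frontier_cost_per_run: float,
    local_cost_per_run: float,
) -> Optional[int]:
    """Return how many local runs are needed to recoup a one-time distillation cost.

    Returns None when the local runtime is not cheaper per run,
    because the up-front cost is then never recovered.
    """
    saving = frontier_cost_per_run - local_cost_per_run
    if saving <= 0:
        return None
    return math.ceil(distill_cost / saving)


if __name__ == "__main__":
    # Illustrative numbers only: 50.00 up front, 0.50 vs 0.25 per run.
    print(break_even_runs(50.0, 0.5, 0.25))
```

Under these assumptions, distillation pays off only for scenarios that are executed often enough to cross the break-even point.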

Health Check

Last Commit

22 hours ago

Responsiveness

Inactive

Pull Requests (30d)

134

Issues (30d)

Star History

49 stars in the last 30 days