stanford-iris-lab
Automated search framework for optimizing model harnesses
Top 37.6% on SourcePulse
Meta-Harness is a framework for automating the search for and optimization of task-specific model harnesses: the components surrounding a base model that manage how it interacts with its environment (storage, retrieval, display). It targets researchers and developers who want to improve AI system performance through end-to-end harness optimization. The repository includes the core framework and two reference experiments, one for text classification and one for Terminal-Bench 2.0.
How It Works
The framework runs an automated search over model harnesses, optimizing elements such as memory systems and scaffold evolution. The goal is to make AI agents more efficient and effective by refining their interaction logic. The repository provides a reusable framework and an onboarding process that uses a coding assistant to generate domain specifications. The examples assume a "proposer agent" (e.g., Claude Code), which requires a specific wrapper for logging interactions.
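To make the idea of searching over harnesses concrete, here is a minimal sketch of a propose-evaluate-keep loop. It is not the repository's API: `HarnessConfig`, `propose`, `search`, and `toy_score` are hypothetical names, the two config fields are invented for illustration, and a random mutation stands in for the proposer agent that the real framework would call.

```python
import random
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class HarnessConfig:
    """Hypothetical harness description; both fields are illustrative only."""
    memory_size: int   # how many past items the harness stores
    retrieval_k: int   # how many stored items it shows the model


def propose(parent: HarnessConfig, rng: random.Random) -> HarnessConfig:
    """Stand-in for a proposer agent: randomly mutate one field of the parent."""
    if rng.random() < 0.5:
        new_memory = max(1, parent.memory_size + rng.choice([-8, 8]))
        return HarnessConfig(new_memory, parent.retrieval_k)
    new_k = max(1, parent.retrieval_k + rng.choice([-1, 1]))
    return HarnessConfig(parent.memory_size, new_k)


def search(
    seed: HarnessConfig,
    evaluate: Callable[[HarnessConfig], float],
    iterations: int,
    rng: random.Random,
) -> Tuple[HarnessConfig, float, List[Tuple[HarnessConfig, float]]]:
    """Greedy search: keep a candidate only if it scores strictly higher."""
    best, best_score = seed, evaluate(seed)
    history = [(best, best_score)]
    for _ in range(iterations):
        candidate = propose(best, rng)
        score = evaluate(candidate)
        history.append((candidate, score))
        if score > best_score:
            best, best_score = candidate, score
    return best, best_score, history


def toy_score(cfg: HarnessConfig) -> float:
    """Made-up objective that peaks at memory_size=64, retrieval_k=4."""
    return -abs(cfg.memory_size - 64) / 8 - abs(cfg.retrieval_k - 4)


if __name__ == "__main__":
    best, score, history = search(
        HarnessConfig(16, 1), toy_score, iterations=50, rng=random.Random(0)
    )
    print(best, score, len(history))
```

In a real run, `evaluate` would execute the candidate harness on the target task (for example, a benchmark) and return its score; the toy objective here only keeps the sketch self-contained.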
Quick Start & Requirements
uv sync and uv run for dependency management and execution.
https://arxiv.org/abs/2603.28052
https://github.com/stanford-iris-lab/meta-harness-tbench2-artifact
ONBOARDING.md (within the repo)
Highlighted Details
Maintenance & Community
Licensing & Compatibility
Limitations & Caveats
The codebase is a cleaned-up version of the paper's code, and testing is limited to verifying basic execution. Detailed setup and runtime instructions are in the subdirectory READMEs. Adapting the framework to a new domain requires implementing a "proposer agent" wrapper; the provided examples are tailored to Claude Code. No license information is listed, so usage restrictions cannot be assessed.
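As a rough illustration of what a logging wrapper around a proposer agent might involve, the sketch below forwards each prompt to an arbitrary callable and appends the exchange to a JSON Lines file. The wrapper interface the repository actually expects is not described here, so `LoggingProposer`, `echo_proposer`, and the log record fields are all assumptions made for this example.

```python
import json
import time
from pathlib import Path
from typing import Callable


class LoggingProposer:
    """Hypothetical wrapper: forwards prompts to a proposer and logs each exchange."""

    def __init__(self, proposer: Callable[[str], str], log_path: Path) -> None:
        self._proposer = proposer
        self._log_path = log_path

    def __call__(self, prompt: str) -> str:
        start = time.time()
        response = self._proposer(prompt)
        record = {
            "timestamp": start,
            "duration_s": time.time() - start,
            "prompt": prompt,
            "response": response,
        }
        # Append one JSON object per line so logs from long runs stay streamable.
        with self._log_path.open("a", encoding="utf-8") as f:
            f.write(json.dumps(record) + "\n")
        return response


def echo_proposer(prompt: str) -> str:
    """Stand-in for a real coding assistant; it only echoes the prompt."""
    return f"proposal for: {prompt}"


if __name__ == "__main__":
    wrapped = LoggingProposer(echo_proposer, Path("proposer_log.jsonl"))
    print(wrapped("improve the retrieval step"))
```

To adapt this to a different assistant, one would replace `echo_proposer` with a function that invokes that assistant and returns its reply as a string; the logging layer stays unchanged.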