pat-jj
Search agents trained with reinforcement learning for long-horizon tasks
Top 57.9% on SourcePulse
Summary
Harness-1 is a 20-billion-parameter search agent for long-horizon tasks, trained with reinforcement learning inside a stateful retrieval harness. The harness takes on the work of managing complex search state, so the agent can focus on semantic decisions about searching, curating evidence, and verifying claims. The project targets researchers and engineers who need stateful, recoverable search operations.
How It Works
The core of the project is its stateful retrieval harness, which maintains recoverable search state: candidate documents, curated evidence, verification records, and budget-aware context. A reinforcement learning policy makes the semantic decisions: which queries to issue, which documents to inspect, which evidence to curate, and when the evidence is sufficient. This allows longer and more persistent search trajectories than stateless models support.
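As a rough illustration of what "recoverable search state" could look like, the sketch below keeps the four kinds of state named above in one serializable object. It is not the project's code: the SearchState class, its field names, and its methods are all hypothetical, and the policy that decides what to search or curate is left out entirely.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class SearchState:
    """Hypothetical container for harness state (not the Harness-1 API)."""

    candidates: dict[str, str] = field(default_factory=dict)  # doc_id -> snippet
    evidence: list[str] = field(default_factory=list)         # curated doc_ids
    verifications: list[dict] = field(default_factory=list)   # claim check records
    budget_remaining: int = 8                                  # search calls left

    def spend(self, cost: int = 1) -> bool:
        """Deduct from the budget; refuse if not enough is left."""
        if cost > self.budget_remaining:
            return False
        self.budget_remaining -= cost
        return True

    def add_candidates(self, docs: dict[str, str]) -> bool:
        """Record search results, charging one unit of budget per search."""
        if not self.spend():
            return False
        self.candidates.update(docs)
        return True

    def curate(self, doc_id: str) -> None:
        """Promote a known candidate to evidence (idempotent)."""
        if doc_id not in self.candidates:
            raise KeyError(doc_id)
        if doc_id not in self.evidence:
            self.evidence.append(doc_id)

    def verify(self, claim: str, doc_id: str, supported: bool) -> None:
        """Append a verification record for a claim against a document."""
        self.verifications.append(
            {"claim": claim, "doc_id": doc_id, "supported": supported}
        )

    def to_json(self) -> str:
        """Snapshot the full state so a run can be resumed later."""
        return json.dumps(asdict(self), sort_keys=True)

    @classmethod
    def from_json(cls, payload: str) -> "SearchState":
        """Rebuild the state from a snapshot produced by to_json."""
        return cls(**json.loads(payload))
```

A run would call add_candidates after each search, curate and verify as the policy decides, and write to_json() to disk between steps. Restoring with from_json yields an object equal to the one that was saved, which is the property that makes a long trajectory resumable after an interruption.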
Quick Start & Requirements
A minimal local smoke test requires Linux with Python 3.11+, uv, and a CUDA-compatible NVIDIA GPU environment with vLLM and GPT-OSS support. To install, run uv sync --extra vllm and set the HARNESS1_HF_MODEL environment variable to pat-jj/harness-1. Full BrowseComp+ evaluation requires additional setup: BrowseComp+ data files, a Chroma collection, and OpenAI API credentials. Detailed guides are available in docs/run_vllm_browsecompplus.md.
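Before running the smoke test, it can help to confirm the basic prerequisites listed above. The following is a small, hypothetical preflight helper, not something shipped with the repository; the preflight function and its parameters are assumptions made for illustration. It checks only the Python version, the presence of uv on PATH, and the HARNESS1_HF_MODEL value, and does not attempt to detect the GPU, vLLM, or GPT-OSS support.

```python
import os
import shutil
import sys

# Value the quick start says HARNESS1_HF_MODEL should be set to.
EXPECTED_MODEL = "pat-jj/harness-1"


def preflight(env=None, version=None, which=shutil.which):
    """Return a list of human-readable problems; empty means the checks passed.

    All three parameters are injectable so the function can be exercised
    without touching the real environment (hypothetical helper).
    """
    env = os.environ if env is None else env
    version = sys.version_info[:2] if version is None else tuple(version)

    problems = []
    if version < (3, 11):
        problems.append(f"Python 3.11+ required, found {version[0]}.{version[1]}")
    if which("uv") is None:
        problems.append("uv not found on PATH")
    if env.get("HARNESS1_HF_MODEL") != EXPECTED_MODEL:
        problems.append(f"HARNESS1_HF_MODEL should be set to {EXPECTED_MODEL}")
    return problems
```

Calling preflight() with no arguments inspects the current interpreter and environment; printing each returned string gives a quick checklist of what still needs fixing before running uv sync --extra vllm.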
Maintenance & Community
The project is associated with authors Pengcheng Jiang, Zhiyi Shi, Kelly Hong, et al. Support and bug reporting are managed through the repository's issue tracker. No specific community channels (e.g., Discord, Slack) or roadmap links are provided in the README.
Licensing & Compatibility
The README does not explicitly state a software license. Until the license is clarified, commercial use or integration into closed-source projects carries risk and calls for further investigation.
Limitations & Caveats
Full BrowseComp+ evaluation requires a compatible Chroma retrieval backend and associated data, neither of which is bundled with the repository. Results may vary because of external retrieval and reranking services. Local serving requires a CUDA GPU with sufficient memory; H100-class hardware is the validated configuration, and other GPUs may work but are not guaranteed. Certain training and model export workflows depend on private Tinker checkpoints or hosted services.