harness by awizemann

AI agent for simulating user testing

Created 2 months ago

295 stars

Top 89.6% on SourcePulse

Project Summary

Summary

Harness provides AI-driven user testing for iOS Simulator, macOS, and web applications. Users define goals in plain language; an LLM agent then pursues them by interacting with the UI as a real user would and reports the UX friction it encounters. It is aimed at developers and QA teams as an alternative to traditional scripted UI tests.

How It Works

An LLM agent interprets plain-language goals and personas and drives the UI through platform-specific drivers: WebDriverAgent for iOS Simulator, native macOS APIs (CGEvent, AXUIElement) for macOS apps, and WKWebView for web apps. A central technique is "Set-of-Mark" targeting, which overlays numbered badges on interactive elements so the agent can target them by ID (tap_mark(id)) instead of by screen coordinates, which are less reliable. The system also supports local LLM inference via Ollama for private, offline testing at zero cost, at the price of slower runs. Each run produces a goal completion status, a replayable action sequence, and a detailed friction report.
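The idea behind Set-of-Mark targeting can be illustrated with a minimal Python sketch. This is not Harness's code: the Element type, the assign_marks helper, and the reading-order numbering are assumptions made for illustration. Only the tap_mark(id) name comes from the description above, and the real signature may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Element:
    """Hypothetical UI element as it might come from an accessibility tree."""
    role: str
    label: str
    x: float
    y: float
    width: float
    height: float
    interactive: bool


def assign_marks(elements):
    """Number interactive elements 1..N in reading order (assumed ordering)."""
    interactive = [e for e in elements if e.interactive]
    interactive.sort(key=lambda e: (e.y, e.x))  # top to bottom, then left to right
    return {mark_id: e for mark_id, e in enumerate(interactive, start=1)}


def tap_mark(marks, mark_id):
    """Resolve a mark ID to the center point of its element.

    The agent only names the ID; the driver derives the coordinates,
    so the model never has to guess pixel positions.
    """
    if mark_id not in marks:
        raise KeyError(f"unknown mark id {mark_id}")
    e = marks[mark_id]
    return (e.x + e.width / 2, e.y + e.height / 2)


if __name__ == "__main__":
    screen = [
        Element("button", "Submit", 100, 400, 80, 40, True),
        Element("label", "Title", 0, 0, 200, 30, False),
        Element("textfield", "Email", 20, 100, 200, 40, True),
    ]
    marks = assign_marks(screen)
    for mark_id, element in marks.items():
        print(mark_id, element.label, tap_mark(marks, mark_id))
```

In this sketch the non-interactive label gets no badge, and an ID that is not on screen raises an error rather than producing a tap at a stale location.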

Quick Start & Requirements

Primary install/run: Clone the repo, update submodules, install xcodegen and idb-companion via Homebrew, run xcodegen generate, and open the Xcode project.
Prerequisites: macOS 14+ (Apple Silicon & Intel), Swift 6. Requires Homebrew for xcodegen and idb-companion. macOS Screen Recording and Accessibility permissions are necessary.
Resource Footprint: App size is ~12 MB. Initial WebDriverAgent build takes ~1-2 minutes.
Links: Website, Wiki, Roadmap.

Highlighted Details

Local LLM Inference: Integrates with Ollama (Qwen3-VL, Gemma, Llama) for private, offline, zero-cost testing, keeping screenshots local.
Set-of-Mark Targeting: Replaces pixel-based interaction with element ID targeting (tap_mark(id)) across all platforms using accessibility trees and shadow DOM probing.
Smart Settle Gates: Uses dHash comparison of screenshots and DOM stability checks (web) to wait until the UI has stopped changing, so that state is not captured mid-render.
harness-cli: A development-time driver sharing core logic, enabling rapid iteration on prompts and models without full application rebuilds.
Secure Credential Handling: Per-application credential storage allows the agent to fill forms securely, with sensitive data masked from LLM context and logs.
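As a rough illustration of the screenshot side of a settle gate, the sketch below computes a difference hash (dHash) over grayscale frames and treats the UI as settled once consecutive frames hash nearly identically. It is an assumption-laden sketch, not Harness's implementation: the function names, the 8x8 hash size, the box-average downsampling, and the threshold values are all hypothetical, and frames are plain non-empty 2D lists of grayscale values to keep the example dependency-free.

```python
def dhash(pixels, hash_w=8, hash_h=8):
    """Difference hash of a grayscale image given as a 2D list of rows.

    The image is box-averaged down to (hash_w + 1) x hash_h cells, then each
    bit records whether a cell is brighter than its right-hand neighbour.
    """
    src_h, src_w = len(pixels), len(pixels[0])
    grid_w = hash_w + 1
    small = []
    for gy in range(hash_h):
        y0 = gy * src_h // hash_h
        y1 = max(y0 + 1, (gy + 1) * src_h // hash_h)
        row = []
        for gx in range(grid_w):
            x0 = gx * src_w // grid_w
            x1 = max(x0 + 1, (gx + 1) * src_w // grid_w)
            block = [pixels[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            row.append(sum(block) / len(block))
        small.append(row)

    bits = 0
    for row in small:
        for left, right in zip(row, row[1:]):
            bits = (bits << 1) | (1 if left > right else 0)
    return bits


def hamming(a, b):
    """Number of differing bits between two hashes."""
    return bin(a ^ b).count("1")


def is_settled(frames, threshold=2, stable_frames=2):
    """True when the last `stable_frames` frame-to-frame transitions each
    differ by at most `threshold` bits (both values are illustrative)."""
    if len(frames) < stable_frames + 1:
        return False
    hashes = [dhash(f) for f in frames[-(stable_frames + 1):]]
    return all(hamming(a, b) <= threshold for a, b in zip(hashes, hashes[1:]))
```

A caller would append each new screenshot to a list and only hand the latest frame to the agent once is_settled returns True, typically with a timeout so that a screen that animates continuously does not block the run.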

Maintenance & Community

Contribution guidelines are in CONTRIBUTING.md, and the Wiki is maintained alongside the code. The roadmap is documented in docs/ROADMAP.md.

Licensing & Compatibility

License: MIT, a permissive license that allows commercial use.
Compatibility: Web app testing is currently limited to WebKit; Chrome support is on the roadmap.

Limitations & Caveats

The project is in alpha (v0.5.0), so instability should be expected. Local LLM inference is significantly slower than cloud-based models and may yield lower-quality friction reports. Web testing is restricted to WebKit.

Health Check

Last Commit

3 weeks ago

Responsiveness

Inactive

Star History

17 stars in the last 30 days