agent by trymeka

Autonomous browsing agent for state-of-the-art web task completion

Created 11 months ago

365 stars

Top 77.1% on SourcePulse


1 Expert Loves This Project

Jason Huggins

Creator of Selenium

Project Summary

Meka Agent is an open-source autonomous agent designed for state-of-the-art web browsing and computer interaction. It aims to mimic human interaction by relying purely on visual input and operating within a full computer context, which makes it suitable for researchers and developers building complex automation workflows.

How It Works

Meka Agent utilizes a vision-centric approach, processing visual information from the computer environment to understand and act. It supports a flexible architecture, allowing users to integrate various Large Language Models (LLMs) with strong visual grounding (e.g., OpenAI o3, Claude Sonnet 4, Claude Opus 4) and infrastructure providers that offer OS-level controls beyond browser screenshots. This OS-level access is crucial for interacting with elements like dropdowns, alerts, and file uploads, which are often rendered at the system level.
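The vision-centric loop described above can be sketched roughly as follows. This is a minimal illustration, not Meka's actual API: the `Computer` and `VisionModel` interfaces, the `Action` shape, and `runTask` are all hypothetical names chosen for this example. The idea is simply that the agent takes a screenshot, asks a visually grounded model for the next action, performs it through the computer provider, and repeats until the model reports completion or a step budget runs out.

```typescript
// Hypothetical types for illustration only; they are not Meka's real interfaces.
type Action =
  | { kind: "click"; x: number; y: number }
  | { kind: "type"; text: string }
  | { kind: "done"; answer: string };

// Stand-in for an infrastructure provider with OS-level control.
interface Computer {
  screenshot(): Promise<string>; // e.g. a base64-encoded image
  perform(action: Exclude<Action, { kind: "done" }>): Promise<void>;
}

// Stand-in for an LLM with visual grounding.
interface VisionModel {
  nextAction(task: string, screenshot: string, history: Action[]): Promise<Action>;
}

// Observe, decide, act; stop when the model says "done" or the budget is spent.
async function runTask(
  task: string,
  computer: Computer,
  model: VisionModel,
  maxSteps = 20,
): Promise<string | null> {
  const history: Action[] = [];
  for (let step = 0; step < maxSteps; step++) {
    const shot = await computer.screenshot();
    const action = await model.nextAction(task, shot, history);
    if (action.kind === "done") return action.answer;
    await computer.perform(action);
    history.push(action);
  }
  return null; // step budget exhausted without a final answer
}
```

Because the model only ever sees screenshots, the same loop would in principle cover system-level elements such as dropdowns, alerts, and file dialogs, provided the computer provider can capture and control them.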

Quick Start & Requirements

Install: npm install @trymeka/core @trymeka/ai-provider-vercel @ai-sdk/openai @trymeka/computer-provider-anchor-browser playwright-core
Prerequisites: OpenAI API Key, Anchor Browser API Key.
Setup: Requires a Node.js environment. Configure the API keys in a .env file.
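As a rough illustration of the setup step, the sketch below checks that both API keys are present before any agent code runs. The helper `loadKeys` is hypothetical, and the variable names are assumptions: `OPENAI_API_KEY` is the name @ai-sdk/openai reads by default, while `ANCHOR_BROWSER_API_KEY` is a guess that should be checked against the project's own documentation.

```typescript
// Hypothetical helper; not part of any @trymeka package.
type EnvSource = Record<string, string | undefined>;

interface ApiKeys {
  openaiApiKey: string;
  anchorBrowserApiKey: string;
}

const REQUIRED_VARS = {
  openaiApiKey: "OPENAI_API_KEY",
  anchorBrowserApiKey: "ANCHOR_BROWSER_API_KEY", // assumed name, verify before use
} as const;

// Reads both keys from an environment-like object and reports every
// missing or blank variable in a single error.
function loadKeys(env: EnvSource): ApiKeys {
  const missing: string[] = [];
  const read = (name: string): string => {
    const value = env[name]?.trim();
    if (!value) missing.push(name);
    return value ?? "";
  };
  const keys: ApiKeys = {
    openaiApiKey: read(REQUIRED_VARS.openaiApiKey),
    anchorBrowserApiKey: read(REQUIRED_VARS.anchorBrowserApiKey),
  };
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(", ")}`);
  }
  return keys;
}
```

In a Node.js script this would typically be called as `loadKeys(process.env)` after the .env file has been loaded.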

Highlighted Details

Achieves 72.7% on the WebArena benchmark.
Supports a "Bring Your Own LLM" philosophy via Vercel's ai-sdk.
Designed for extensibility with custom tools and providers.
Written in TypeScript for a typesafe API.
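To give a sense of what typesafe custom tools could look like, here is a small sketch. Everything in it is hypothetical: the `Tool` interface, the `ToolRegistry` class, and the `word_count` example are invented for illustration and do not reflect Meka's real extension API. The point is only that each tool validates untyped model output in `parse` before a typed `execute` runs.

```typescript
// Hypothetical shapes for illustration; the project's real tool API may differ.
interface Tool<Args, Result> {
  name: string;
  description: string;
  parse(input: unknown): Args; // validate raw model output, throw on bad input
  execute(args: Args): Promise<Result>;
}

class ToolRegistry {
  private tools = new Map<string, Tool<any, unknown>>();

  register<A, R>(tool: Tool<A, R>): this {
    if (this.tools.has(tool.name)) {
      throw new Error(`Tool already registered: ${tool.name}`);
    }
    this.tools.set(tool.name, tool);
    return this;
  }

  // Looks up a tool by name, validates the input, then runs it.
  async call(name: string, input: unknown): Promise<unknown> {
    const tool = this.tools.get(name);
    if (!tool) throw new Error(`Unknown tool: ${name}`);
    return tool.execute(tool.parse(input));
  }
}

// Example custom tool: counts whitespace-separated words in a string.
const wordCount: Tool<{ text: string }, number> = {
  name: "word_count",
  description: "Counts the words in a piece of text.",
  parse(input) {
    if (
      typeof input === "object" &&
      input !== null &&
      typeof (input as { text?: unknown }).text === "string"
    ) {
      return { text: (input as { text: string }).text };
    }
    throw new Error("word_count expects { text: string }");
  },
  async execute(args) {
    return args.text.trim().split(/\s+/).filter(Boolean).length;
  },
};

const registry = new ToolRegistry().register(wordCount);
```

A call such as `registry.call("word_count", { text: "open the menu" })` resolves to the word count, while malformed input is rejected in `parse` before `execute` is reached.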

Maintenance & Community

The project is open source and invites contributions. The repository links to its contributing guidelines.

Licensing & Compatibility

Licensed under the MIT License, permitting commercial use and integration with closed-source projects.

Limitations & Caveats

Anchor Browser is the primary infrastructure provider mentioned, which suggests potential vendor lock-in or a need for specific VM-based environments to get full functionality. Other providers are welcome, but extensive testing is noted only for OpenAI and Claude models.

Health Check

Last Commit

8 months ago

Responsiveness

Inactive

Star History

0 stars in the last 30 days