agent by trymeka

Autonomous browsing agent for state-of-the-art web task completion

Created 2 months ago
348 stars

Top 79.8% on SourcePulse

Project Summary

Meka Agent is an open-source autonomous agent designed for state-of-the-art web browsing and computer interaction. It aims to interact the way a person would, relying purely on visual input and operating within a full computer context, which makes it suitable for researchers and developers building complex automation workflows.

How It Works

Meka Agent takes a vision-centric approach: it processes visual information from the computer environment to understand the current state and decide how to act. Its architecture is flexible, letting users plug in Large Language Models (LLMs) with strong visual grounding (e.g., OpenAI o3, Claude Sonnet 4, Claude Opus 4) and infrastructure providers that expose OS-level controls rather than only browser screenshots. OS-level access matters for elements such as dropdowns, alerts, and file uploads, which are often rendered at the system level.
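
The observe-decide-act cycle described above can be sketched roughly as follows. This is a minimal illustration, not the @trymeka/core API: the names Screenshot, Action, VisionModel, Computer, and runTask are all hypothetical, and a real agent would add retries, logging, and richer action types.

```typescript
// Hypothetical types for illustration only; not the @trymeka/core API.
type Screenshot = { width: number; height: number; pngBase64: string };

type Action =
  | { kind: "click"; x: number; y: number }
  | { kind: "type"; text: string }
  | { kind: "done"; summary: string };

// The "brain": any LLM with visual grounding could sit behind this interface.
interface VisionModel {
  nextAction(task: string, screen: Screenshot, history: Action[]): Promise<Action>;
}

// The "hands": an infrastructure provider with OS-level control.
interface Computer {
  screenshot(): Promise<Screenshot>;
  perform(action: Exclude<Action, { kind: "done" }>): Promise<void>;
}

// Loop: capture the screen, ask the model for one action, execute it,
// and stop when the model reports completion or the step budget runs out.
async function runTask(
  task: string,
  model: VisionModel,
  computer: Computer,
  maxSteps = 20,
): Promise<string> {
  const history: Action[] = [];
  for (let step = 0; step < maxSteps; step++) {
    const screen = await computer.screenshot();
    const action = await model.nextAction(task, screen, history);
    if (action.kind === "done") return action.summary;
    await computer.perform(action);
    history.push(action);
  }
  throw new Error(`Task not finished after ${maxSteps} steps`);
}
```

Keeping the model and the computer behind separate interfaces mirrors the split the summary describes between the LLM and the infrastructure provider, so either side can be swapped without touching the loop.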

Quick Start & Requirements

  • Install: npm install @trymeka/core @trymeka/ai-provider-vercel @ai-sdk/openai @trymeka/computer-provider-anchor-browser playwright-core
  • Prerequisites: OpenAI API Key, Anchor Browser API Key.
  • Setup: Requires a Node.js environment. Configuration involves creating a .env file containing the API keys.
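
As a rough sketch of the setup step, the snippet below parses .env text and fails fast when a key is missing. The variable names are assumptions: OPENAI_API_KEY is the name conventionally read by the OpenAI provider in Vercel's ai-sdk, while ANCHOR_BROWSER_API_KEY is a hypothetical placeholder, since the summary does not name the variables. The helpers parseDotEnv and requireKeys are likewise illustrative, and in practice a library such as dotenv would normally handle the parsing.

```typescript
// Assumed variable names; the source only says API keys go in a .env file.
const REQUIRED_KEYS = ["OPENAI_API_KEY", "ANCHOR_BROWSER_API_KEY"] as const;

type RequiredKey = (typeof REQUIRED_KEYS)[number];
type AgentEnv = Record<RequiredKey, string>;

// Parse the text of a .env file into key/value pairs.
// Handles blank lines, "#" comment lines, and optional surrounding quotes only.
function parseDotEnv(text: string): Record<string, string> {
  const result: Record<string, string> = {};
  for (const rawLine of text.split(/\r?\n/)) {
    const line = rawLine.trim();
    if (line === "" || line.startsWith("#")) continue;
    const eq = line.indexOf("=");
    if (eq <= 0) continue; // skip lines that are not KEY=VALUE
    const key = line.slice(0, eq).trim();
    let value = line.slice(eq + 1).trim();
    const quoted =
      value.length >= 2 &&
      ((value.startsWith('"') && value.endsWith('"')) ||
        (value.startsWith("'") && value.endsWith("'")));
    if (quoted) value = value.slice(1, -1);
    result[key] = value;
  }
  return result;
}

// Validate that every required key is present and non-empty,
// reporting all missing keys in a single error.
function requireKeys(env: Record<string, string | undefined>): AgentEnv {
  const missing = REQUIRED_KEYS.filter((key) => !env[key]);
  if (missing.length > 0) {
    throw new Error(`Missing required keys: ${missing.join(", ")}`);
  }
  return {
    OPENAI_API_KEY: env["OPENAI_API_KEY"] as string,
    ANCHOR_BROWSER_API_KEY: env["ANCHOR_BROWSER_API_KEY"] as string,
  };
}
```

Validating once at startup means a missing key surfaces as one clear error rather than as a failed API call partway through a task.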

Highlighted Details

  • Achieves 72.7% on the WebArena benchmark.
  • Supports a "Bring Your Own LLM" philosophy via Vercel's ai-sdk.
  • Designed for extensibility with custom tools and providers.
  • Written in TypeScript for a type-safe API.
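
To illustrate what a type-safe custom tool could look like, here is a minimal sketch under stated assumptions. The Tool interface, ToolRegistry class, and parse_price tool are hypothetical and are not taken from @trymeka/core; the point is only that each tool narrows untrusted model output to a typed argument before running.

```typescript
// Hypothetical shapes for illustration; not the @trymeka/core tool API.
interface Tool<Args, Result> {
  name: string;
  description: string;
  // Runtime check that narrows untrusted model output to Args, or throws.
  parse(input: unknown): Args;
  execute(args: Args): Result;
}

class ToolRegistry {
  // `any` is used only for storage; each tool stays fully typed at its definition.
  private tools = new Map<string, Tool<any, any>>();

  register<A, R>(tool: Tool<A, R>): void {
    if (this.tools.has(tool.name)) throw new Error(`Duplicate tool: ${tool.name}`);
    this.tools.set(tool.name, tool);
  }

  // Look up a tool by name, validate the raw input, then run it.
  call(name: string, input: unknown): unknown {
    const tool = this.tools.get(name);
    if (!tool) throw new Error(`Unknown tool: ${name}`);
    return tool.execute(tool.parse(input));
  }
}

// Example custom tool: turn a price string seen on screen into a number.
const parsePriceTool: Tool<{ text: string }, number> = {
  name: "parse_price",
  description: "Convert a price string such as $1,299.00 into a number.",
  parse(input) {
    if (typeof input === "object" && input !== null && "text" in input) {
      const text = (input as { text: unknown }).text;
      if (typeof text === "string") return { text };
    }
    throw new Error("parse_price expects { text: string }");
  },
  execute({ text }) {
    // Keep digits and the decimal point; drop currency symbols and separators.
    return Number(text.replace(/[^0-9.]/g, ""));
  },
};

const registry = new ToolRegistry();
registry.register(parsePriceTool);
```

A call such as registry.call("parse_price", { text: "$1,299.00" }) validates the input first, so malformed arguments from the model fail with a clear error instead of reaching the tool body.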

Maintenance & Community

The project is open source and invites contributions; links to the contributing guidelines are available.

Licensing & Compatibility

Licensed under the MIT License, permitting commercial use and integration with closed-source projects.

Limitations & Caveats

The primary infrastructure provider mentioned is Anchor Browser, which suggests potential vendor lock-in or a need for specific VM-based environments to get full functionality. Other providers are welcome, but extensive testing is noted for OpenAI and Claude models.

Health Check

  • Last commit: 1 week ago
  • Responsiveness: Inactive
  • Pull requests (30d): 14
  • Issues (30d): 0
  • Star history: 3 stars in the last 30 days
