Discover and explore top open-source AI tools and projects—updated daily.
microsoftAgentic model for visual computer task automation
Top 28.2% on SourcePulse
Fara-7B is Microsoft's compact, 7-billion-parameter agentic Small Language Model (SLM) engineered for computer use. It automates complex web-based tasks by directly interacting with computer interfaces, offering an efficient and privacy-preserving alternative to larger, more resource-intensive agent systems. Its primary benefit lies in achieving state-of-the-art performance within its size class, enabling on-device deployment and significantly reducing task completion steps.
How It Works
Fara-7B operates visually, perceiving webpages and executing actions like scrolling, typing, and clicking at predicted coordinates, mimicking human interaction without relying on accessibility trees. This approach is enabled by its foundation on Qwen2.5-VL-7B and training via a novel synthetic data pipeline using the Magentic-One multi-agent framework. This methodology allows for efficient task completion, averaging approximately 16 steps per task, compared to over 40 steps for comparable models.
Quick Start & Requirements
Installation involves cloning the repository, setting up a Python virtual environment, and installing dependencies via pip install -e . and playwright install. For hosting, Azure Foundry is recommended for a serverless experience without local GPU requirements. Alternatively, self-hosting is possible using vLLM (vllm serve "microsoft/Fara-7B" --port 5000 --dtype auto) on machines with sufficient GPU VRAM. Running tasks is done via fara-cli --task "...". Links: Model, Dataset, Azure Foundry.
Highlighted Details
Maintenance & Community
This is an experimental release aimed at community exploration and feedback. While specific community channels (like Discord/Slack) are not detailed, the project collaborates with BrowserBase for task annotation.
Licensing & Compatibility
The specific license is not detailed in the provided README. Users should verify compatibility for commercial use or integration into closed-source projects.
Limitations & Caveats
Fara-7B is an experimental release, and users are advised to run it in sandboxed environments, monitor its execution, and avoid sensitive data or high-risk domains. Reproducing results on live websites presents inherent challenges due to dynamic web content, although the project implements measures like BrowserBase integration and task updates to mitigate this.
2 days ago
Inactive
gptme
Fosowl