AutoNode by TransformerOptimus

Cognitive GUI automation engine for web interactions and data extraction

Created 1 year ago

291 stars

Top 90.7% on SourcePulse

Project Summary

AutoNode is a neuro-graphic, self-learnable engine for cognitive GUI automation, targeting developers and researchers who need to automate web interactions and data extraction. It offers a programmatic approach to web navigation and data retrieval by combining OCR, object detection (YOLO), and a custom site-graph representation of web pages.

How It Works

AutoNode operates by interpreting a user-defined site-graph (JSON) that maps out web page elements and their relationships. It uses YOLO models for object detection to identify interactive elements and OCR for text recognition, enabling dynamic interaction with web interfaces. This approach allows for programmatic control over web automation tasks, moving beyond simple scraping to complex, context-aware interactions.
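One way to picture how detection and OCR results might be combined is sketched below. It is an illustration only, not AutoNode's actual code: the Box, Detection, and OcrWord types and the find_element helper are hypothetical, and the matching rule (an OCR word whose center falls inside a detected box) is an assumed heuristic.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Box:
    """Axis-aligned pixel box: (x1, y1) is top-left, (x2, y2) is bottom-right."""
    x1: float
    y1: float
    x2: float
    y2: float

    def contains(self, x: float, y: float) -> bool:
        return self.x1 <= x <= self.x2 and self.y1 <= y <= self.y2

    def center(self) -> Tuple[float, float]:
        return ((self.x1 + self.x2) / 2, (self.y1 + self.y2) / 2)


@dataclass
class Detection:
    """Hypothetical object-detector output: an element class and its box."""
    label: str
    box: Box


@dataclass
class OcrWord:
    """Hypothetical OCR output: recognized text and its box."""
    text: str
    box: Box


def find_element(detections: List[Detection], words: List[OcrWord],
                 label: str, text: str) -> Optional[Detection]:
    """Return the first detection of class `label` whose box contains the
    center of an OCR word equal to `text` (case-insensitive)."""
    wanted = text.strip().lower()
    for det in detections:
        if det.label != label:
            continue
        for word in words:
            cx, cy = word.box.center()
            if word.text.strip().lower() == wanted and det.box.contains(cx, cy):
                return det
    return None


# Made-up sample data standing in for one screenshot's results.
detections = [
    Detection("button", Box(10, 10, 110, 40)),
    Detection("button", Box(10, 60, 110, 90)),
]
words = [
    OcrWord("Cancel", Box(30, 15, 90, 35)),
    OcrWord("Submit", Box(30, 65, 90, 85)),
]

target = find_element(detections, words, "button", "submit")
click_point = target.box.center() if target else None
```

In this sketch the detector answers "where are the buttons?" and OCR answers "which one says Submit?"; the click point is simply the center of the matched box.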

Quick Start & Requirements

Installation: Clone the repository, copy .env.example to .env in each module, then run docker compose -f docker-compose.yaml up --build.
Prerequisites: Python, Docker.
Access: API documentation available at http://localhost:8001/docs.
Initiation: Use the /api/autonode/initiate endpoint with a JSON payload specifying site_url, objective, graph_path, and planner_prompt.
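A minimal sketch of building the initiation request with Python's standard library follows. Several details are assumptions rather than facts from the README: that the endpoint accepts POST, that it is served from the same host and port as the API docs, and every payload value shown (the URL, objective text, graph path, and planner prompt are placeholders). The build_initiate_request helper is hypothetical.

```python
import json
import urllib.request

# Assumption: the API is served from the same host and port as /docs.
BASE_URL = "http://localhost:8001"


def build_initiate_request(site_url, objective, graph_path, planner_prompt,
                           base_url=BASE_URL):
    """Build (but do not send) a request carrying the four documented fields."""
    payload = {
        "site_url": site_url,
        "objective": objective,
        "graph_path": graph_path,
        "planner_prompt": planner_prompt,
    }
    return urllib.request.Request(
        url=f"{base_url}/api/autonode/initiate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",  # assumption: the HTTP method is not stated in the summary
    )


# Placeholder values for illustration only.
request = build_initiate_request(
    site_url="https://example.com",
    objective="Find the pricing page and download the brochure",
    graph_path="graphs/example.json",
    planner_prompt="example",
)

# To send it once the containers are running:
# with urllib.request.urlopen(request) as response:
#     print(response.status, response.read().decode("utf-8"))
```

The interactive docs at /docs remain the authoritative source for the accepted method, field types, and response shape.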

Highlighted Details

Leverages YOLOv8 models trained on web screenshots for object detection.
Supports remote hosting of YOLO and OCR modules for resource-constrained environments.
Site-graphs are JSON files defining nodes (elements) and edges (navigation flow).
Offers options for local or AWS S3 storage of debugging screenshots and downloaded output.
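To make the nodes-and-edges idea concrete, here is a small sketch of a site-graph and a walk over it. The schema is invented for illustration (the "nodes" and "edges" keys, node names, and fields are all assumptions); AutoNode's real site-graph format may differ, so consult the repository's example graphs before writing one.

```python
import json
from collections import deque

# Hypothetical site-graph: nodes are page elements, edges are navigation steps.
SITE_GRAPH_JSON = """
{
  "nodes": {
    "search_box":    {"type": "textbox", "label": "Search"},
    "search_button": {"type": "button",  "label": "Go"},
    "filters":       {"type": "button",  "label": "Filters"},
    "results_list":  {"type": "list",    "label": "Results"},
    "download_link": {"type": "link",    "label": "Download"}
  },
  "edges": [
    ["search_box", "search_button"],
    ["search_button", "results_list"],
    ["search_button", "filters"],
    ["filters", "results_list"],
    ["results_list", "download_link"]
  ]
}
"""


def shortest_path(graph, start, goal):
    """Breadth-first search over the directed edges; returns a list of node
    names from start to goal, or None if the goal is unreachable."""
    adjacency = {name: [] for name in graph["nodes"]}
    for src, dst in graph["edges"]:
        adjacency[src].append(dst)

    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for nxt in adjacency[path[-1]]:
            if nxt not in visited:
                visited.add(nxt)
                queue.append(path + [nxt])
    return None


graph = json.loads(SITE_GRAPH_JSON)
path = shortest_path(graph, "search_box", "download_link")
```

Here path lists the elements to interact with in order, which is the kind of navigation flow the edges are meant to encode.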

Maintenance & Community

The project is hosted on GitHub under the TransformerOptimus organization. The README does not link to community channels such as Discord or Slack.

Licensing & Compatibility

The README does not state the repository's license. Commercial use or closed-source linking would require the licensing terms to be clarified first.

Limitations & Caveats

The README implies that training a custom YOLO model is a manual process. Remote hosting is supported but requires setting up a separate server. Site-graphs must be written by hand as JSON, which can be labor-intensive for complex websites.

Explore Similar Projects

browser-agent-py by oxylabs

oxylabs-ai-studio-py by oxylabs

dendrite-python-sdk by dendrite-systems

ActGPT by ethanhe42

fuji-web by normal-computing

browserpilot by handrew

browserable by browserable

Browser4 by platonai

computer-use-preview by google-gemini

midscene by web-infra-dev

skyvern by Skyvern-AI

owl by camel-ai