Cognitive GUI automation engine for web interactions and data extraction
AutoNode is a neuro-graphic, self-learnable engine for cognitive GUI automation, targeting developers and researchers who need to automate web interactions and data extraction. It offers a programmatic approach to web navigation and data retrieval by combining OCR, object detection (YOLO), and a custom site-graph representation of web pages.
How It Works
AutoNode operates by interpreting a user-defined site-graph (JSON) that maps out web page elements and their relationships. It uses YOLO models for object detection to identify interactive elements and OCR for text recognition, enabling dynamic interaction with web interfaces. This approach allows for programmatic control over web automation tasks, moving beyond simple scraping to complex, context-aware interactions.
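To make the idea of a site-graph concrete, here is a minimal sketch of a JSON graph of page elements and a traversal over it. AutoNode's actual schema is not given in this summary, so the field names (`nodes`, `kind`, `next`), the element names, and the `shortest_path` helper are all hypothetical and chosen only for illustration.

```python
import json
from collections import deque

# Hypothetical site-graph: each node is a page element, and "next" lists
# the elements reachable after interacting with it. Not AutoNode's schema.
SITE_GRAPH_JSON = """
{
  "nodes": {
    "search_box":    {"kind": "text_input", "next": ["search_button"]},
    "search_button": {"kind": "button",     "next": ["results_list"]},
    "results_list":  {"kind": "container",  "next": []}
  }
}
"""


def shortest_path(graph, start, goal):
    """Breadth-first search for the shortest chain of elements from start to goal."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for neighbour in graph["nodes"][node]["next"]:
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(path + [neighbour])
    return None  # goal is not reachable from start


graph = json.loads(SITE_GRAPH_JSON)
path = shortest_path(graph, "search_box", "results_list")
print(" -> ".join(path))
```

In a real run, each step along such a path would presumably be matched to an on-screen element through object detection and OCR before the engine acts on it.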
Quick Start & Requirements
Copy .env.example to .env for each module, and run docker compose -f docker-compose.yaml up --build.
API documentation is served at http://localhost:8001/docs.
To start a run, call the /api/autonode/initiate endpoint with a JSON payload specifying site_url, objective, graph_path, and planner_prompt.
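The call to /api/autonode/initiate can be sketched as follows. The four payload keys come from the text above. The payload values are placeholders, and both the POST method and the use of the docs host (localhost:8001) as the API base are assumptions, since neither is stated here. The request is only built, not sent.

```python
import json
import urllib.request

# Assumption: the API is served from the same host and port as the docs page.
BASE_URL = "http://localhost:8001"


def build_initiate_request(site_url, objective, graph_path, planner_prompt):
    """Build (but do not send) a request for the /api/autonode/initiate endpoint."""
    payload = {
        "site_url": site_url,
        "objective": objective,
        "graph_path": graph_path,
        "planner_prompt": planner_prompt,
    }
    return urllib.request.Request(
        url=f"{BASE_URL}/api/autonode/initiate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",  # assumption: the HTTP method is not stated in the text
    )


request = build_initiate_request(
    site_url="https://example.com",       # placeholder value
    objective="Find the pricing page",    # placeholder value
    graph_path="graphs/example.json",     # placeholder value
    planner_prompt="example",             # placeholder value
)

# To send it once the containers are running:
# with urllib.request.urlopen(request) as response:
#     print(response.status, response.read().decode("utf-8"))
```

The interactive page at http://localhost:8001/docs is the place to confirm the exact method, accepted values, and response shape before relying on this sketch.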
Highlighted Details
Maintenance & Community
The project is hosted on GitHub under the TransformerOptimus organization. Links to community resources like Discord or Slack are not explicitly provided in the README.
Licensing & Compatibility
The repository's license is not explicitly stated in the provided README. Compatibility for commercial use or closed-source linking would require clarification of the licensing terms.
Limitations & Caveats
The README implies that custom YOLO models must be trained manually. Remote hosting is supported but requires a separate server setup. Site-graphs are also defined by hand in JSON, which can be labor-intensive for complex websites.