AutoNode by TransformerOptimus

Cognitive GUI automation engine for web interactions and data extraction

created 1 year ago
289 stars

Top 91.9% on sourcepulse

Project Summary

AutoNode is a neuro-graphic, self-learnable engine for cognitive GUI automation, targeting developers and researchers who need to automate web interactions and data extraction. It offers a programmatic approach to web navigation and data retrieval by combining OCR, object detection (YOLO), and a custom site-graph representation of web pages.

How It Works

AutoNode operates by interpreting a user-defined site-graph (JSON) that maps out web page elements and their relationships. It uses YOLO models for object detection to identify interactive elements and OCR for text recognition, enabling dynamic interaction with web interfaces. This approach allows for programmatic control over web automation tasks, moving beyond simple scraping to complex, context-aware interactions.
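To make the site-graph idea concrete, here is a minimal sketch of a graph of page elements and a breadth-first search over its navigation edges. The field names (`nodes`, `edges`, `id`, `type`, `label`, `from`, `to`) and the example elements are assumptions chosen for illustration; AutoNode's real schema may differ, so check the repository before relying on any of them.

```python
import json
from collections import deque

# Hypothetical site-graph: field names and elements are illustrative only,
# not AutoNode's actual schema.
SITE_GRAPH_JSON = """
{
  "nodes": [
    {"id": "search_box", "type": "input", "label": "Search"},
    {"id": "search_button", "type": "button", "label": "Go"},
    {"id": "results_table", "type": "table", "label": "Results"},
    {"id": "download_link", "type": "link", "label": "Download CSV"}
  ],
  "edges": [
    {"from": "search_box", "to": "search_button"},
    {"from": "search_button", "to": "results_table"},
    {"from": "results_table", "to": "download_link"}
  ]
}
"""


def load_adjacency(graph_json):
    """Parse the JSON and build a node-id -> list of successor ids mapping."""
    graph = json.loads(graph_json)
    adjacency = {node["id"]: [] for node in graph["nodes"]}
    for edge in graph["edges"]:
        adjacency[edge["from"]].append(edge["to"])
    return adjacency


def find_path(adjacency, start, goal):
    """Breadth-first search; returns a list of node ids, or None if unreachable."""
    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for nxt in adjacency.get(path[-1], []):
            if nxt not in visited:
                visited.add(nxt)
                queue.append(path + [nxt])
    return None


adjacency = load_adjacency(SITE_GRAPH_JSON)
path = find_path(adjacency, "search_box", "download_link")
print(path)
```

Treating the edges as directed reflects the idea of a navigation flow: an automation step can only move to elements that the graph says are reachable from the current one.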

Quick Start & Requirements

  • Installation: Clone the repository, copy .env.example to .env for each module, and run docker compose -f docker-compose.yaml up --build.
  • Prerequisites: Python, Docker.
  • Access: API documentation available at http://localhost:8001/docs.
  • Initiation: Use the /api/autonode/initiate endpoint with a JSON payload specifying site_url, objective, graph_path, and planner_prompt.
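The initiation step above could be sketched as follows. The four payload fields come from the summary. The POST method, the base URL (assumed to be the same host and port as the documentation page), and all example values are assumptions. The request is only built here, not sent, so the snippet runs without the service.

```python
import json
import urllib.request

# Assumption: the API is served from the same host and port as the /docs page.
BASE_URL = "http://localhost:8001"


def build_initiate_request(site_url, objective, graph_path, planner_prompt):
    """Build (but do not send) a request for the /api/autonode/initiate endpoint."""
    payload = {
        "site_url": site_url,
        "objective": objective,
        "graph_path": graph_path,
        "planner_prompt": planner_prompt,
    }
    return urllib.request.Request(
        url=f"{BASE_URL}/api/autonode/initiate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",  # assumption: the summary does not state the HTTP method
    )


# All values below are placeholders for illustration.
request = build_initiate_request(
    site_url="https://example.com",
    objective="Download the latest report",
    graph_path="graphs/example.json",
    planner_prompt="example",
)
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))

# With the containers running, the request could be sent with:
# urllib.request.urlopen(request)
```

The accepted values for each field, especially planner_prompt, are best confirmed against the API documentation page before sending a real request.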

Highlighted Details

  • Leverages YOLOv8 models trained on web screenshots for object detection.
  • Supports remote hosting of YOLO and OCR modules for resource-constrained environments.
  • Site-graphs are JSON files defining nodes (elements) and edges (navigation flow).
  • Offers options for local or AWS S3 storage of debugging screenshots and downloaded output.
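As an illustration of how object detection and OCR output can be combined, the sketch below attaches recognized words to the detected element whose bounding box contains the word's center. This is one plausible pairing strategy, not a description of AutoNode's internals. The `Detection` and `OcrWord` types, the coordinates, and the containment rule are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Detection:
    """A detected UI element (hypothetical shape of a detector's output)."""
    kind: str
    box: tuple  # (x1, y1, x2, y2) in pixels
    words: list = field(default_factory=list)


@dataclass
class OcrWord:
    """A recognized word with its bounding box (hypothetical OCR output)."""
    text: str
    box: tuple  # (x1, y1, x2, y2) in pixels


def center(box):
    x1, y1, x2, y2 = box
    return ((x1 + x2) / 2, (y1 + y2) / 2)


def contains(box, point):
    x1, y1, x2, y2 = box
    px, py = point
    return x1 <= px <= x2 and y1 <= py <= y2


def attach_text(detections, words):
    """Assign each word to the first detection whose box contains its center."""
    for word in words:
        point = center(word.box)
        for det in detections:
            if contains(det.box, point):
                det.words.append(word.text)
                break
    return detections


detections = [
    Detection("button", (10, 10, 110, 40)),
    Detection("input", (10, 60, 210, 90)),
]
words = [
    OcrWord("Sign", (20, 15, 50, 35)),
    OcrWord("in", (55, 15, 70, 35)),
    OcrWord("Email", (15, 65, 60, 85)),
    OcrWord("Footer", (300, 400, 360, 420)),  # lies outside every detection
]

labeled = attach_text(detections, words)
for det in labeled:
    print(det.kind, "->", " ".join(det.words))
```

Words that fall outside every detected element are simply left unassigned, which keeps stray page text from being attributed to an interactive control.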

Maintenance & Community

The project is hosted on GitHub under the TransformerOptimus organization. Links to community resources like Discord or Slack are not explicitly provided in the README.

Licensing & Compatibility

The repository's license is not explicitly stated in the provided README. Compatibility for commercial use or closed-source linking would require clarification of the licensing terms.

Limitations & Caveats

The README implies that training custom YOLO models is a manual process. Remote hosting of the YOLO and OCR modules is supported but requires separate server setup. Site-graphs must be written by hand as JSON, which can be labor-intensive for complex websites.

Health Check

  • Last commit: 1 year ago
  • Responsiveness: 1 day
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 7 stars in the last 90 days
