omni-bot-sdk-oss  by weixin-omni

WeChat RPA framework with visual recognition

Created 2 months ago
317 stars

Top 85.2% on SourcePulse

Project Summary

This project provides a WeChat RPA framework for building chatbots that integrate with LLM services such as OpenAI and Dify. It targets developers and researchers who want to automate WeChat interactions without direct API access, and it takes a zero-intrusion approach based on visual recognition.

How It Works

The framework employs a unique visual recognition strategy using custom YOLO models and OCR to interact with the WeChat client. This "zero-intrusion" method avoids direct hooking or reverse engineering of WeChat's internal APIs, aiming to reduce detection risks. Messages are captured via database polling with minimal latency, and actions are queued for an RPA executor that simulates user input.
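The capture-then-queue flow described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the SDK's actual code: the `messages` table, its columns, and the function names `poll_new_messages` and `run_executor` are hypothetical, and an in-memory SQLite database stands in for the WeChat database.

```python
import queue
import sqlite3


def poll_new_messages(conn, last_id):
    """Return rows with id greater than last_id, plus the updated high-water mark."""
    rows = conn.execute(
        "SELECT id, sender, body FROM messages WHERE id > ? ORDER BY id",
        (last_id,),
    ).fetchall()
    if rows:
        last_id = rows[-1][0]
    return rows, last_id


def run_executor(actions):
    """Drain the queue one action at a time, mimicking a linear RPA executor."""
    performed = []
    while not actions.empty():
        performed.append(actions.get())
    return performed


# Hypothetical stand-in for the message database (schema is an assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE messages (id INTEGER PRIMARY KEY, sender TEXT, body TEXT)"
)
conn.executemany(
    "INSERT INTO messages (sender, body) VALUES (?, ?)",
    [("alice", "hi"), ("bob", "ping")],
)

# One polling pass: every new message becomes a queued action.
actions = queue.Queue()
rows, last_id = poll_new_messages(conn, 0)
for _id, sender, body in rows:
    actions.put(("reply", sender, f"echo: {body}"))

performed = run_executor(actions)
```

Tracking the highest id seen keeps each polling pass cheap and prevents a message from being handled twice. Routing every action through one queue reflects the point that only one simulated input sequence can drive the client at a time.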

Quick Start & Requirements

  • Installation: pip install omni-bot-sdk (requires Python 3.12).
  • Prerequisites: A separate tool (DbkeyHookCMD.exe or DbkeyHookUI.exe) is needed to obtain the WeChat database key. An MQTT service (e.g., nanomq) is required for message forwarding.
  • Setup: Obtain the database key, then configure config.yaml. Avoid using the mouse or keyboard while RPA actions are executing, since simulated input can conflict with real input.
  • Documentation: Plugin documentation and examples are available in linked repositories.

Highlighted Details

  • Near-zero-latency message reception by polling the WeChat database.
  • Zero-intrusion at runtime: YOLO and OCR drive the client visually, which aims to reduce detection risk.
  • Plugin architecture for extensibility, supporting custom actions like sending Moments and Mini Programs.
  • In principle compatible with the latest WeChat versions, because it relies on the visible UI rather than internal APIs.

Maintenance & Community

The project is actively maintained, with a roadmap indicating future improvements like enhanced image parsing and robustness. Community interaction is encouraged via developer discussion groups.

Licensing & Compatibility

Licensed under GPLv3. This is a copyleft license: derivative works that are distributed must also be released under GPLv3, which could affect commercial or closed-source integration.

Limitations & Caveats

The RPA approach relies on visual recognition, which is not 100% accurate and can be thrown off by UI changes or by windows with similar titles. RPA operations run linearly, one at a time, and can conflict with user input. The database polling mechanism might be detectable by security software. Sending a message requires identifying the contact precisely, and it fails when several contacts share the same name.
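The duplicate-name limitation can be illustrated with a small guard that refuses to send when a display name is ambiguous. This is a sketch only: `resolve_contact`, `AmbiguousContactError`, and the contact dictionaries are hypothetical and are not part of the SDK.

```python
class AmbiguousContactError(Exception):
    """Raised when more than one contact shares the requested display name."""


def resolve_contact(contacts, display_name):
    """Return the single contact matching display_name, or raise an error."""
    matches = [c for c in contacts if c["name"] == display_name]
    if not matches:
        raise KeyError(display_name)
    if len(matches) > 1:
        raise AmbiguousContactError(
            f"{len(matches)} contacts are named {display_name!r}"
        )
    return matches[0]


# Hypothetical contact list; the "id" and "name" fields are assumptions.
contacts = [
    {"id": "c1", "name": "Alice"},
    {"id": "c2", "name": "Bob"},
    {"id": "c3", "name": "Bob"},
]

unique = resolve_contact(contacts, "Alice")

try:
    resolve_contact(contacts, "Bob")
    ambiguous = False
except AmbiguousContactError:
    ambiguous = True
```

Failing loudly on an ambiguous name is a conservative choice here: a visually driven sender that picks the first match could deliver a message to the wrong person.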

Health Check

  • Last Commit: 1 month ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 32 stars in the last 30 days
