WeChat RPA framework with visual recognition
Top 85.2% on SourcePulse
This project provides a WeChat RPA framework for building chat robots, integrating with LLMs like OpenAI and Dify. It targets developers and researchers interested in automating WeChat interactions without direct API access, offering a zero-intrusion approach via visual recognition.
How It Works
The framework uses custom YOLO models and OCR to recognize and interact with the WeChat client's interface. This "zero-intrusion" method avoids hooking into or reverse engineering WeChat's internal APIs, with the aim of reducing the risk of detection. Incoming messages are captured by polling the WeChat database with minimal latency, and outgoing actions are queued for an RPA executor that simulates user input.
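The poll-then-queue flow described above can be sketched with the Python standard library alone. This is a minimal illustration under stated assumptions, not the omni-bot-sdk implementation: the `messages` table, its columns, the `send_text` action tuple, and all function names are hypothetical, and an in-memory SQLite database stands in for the WeChat database.

```python
import queue
import sqlite3
import threading


def fetch_new_messages(conn, last_id):
    """Return rows with an id greater than last_id, oldest first.

    The `messages` table is a hypothetical stand-in, not the WeChat schema.
    """
    cur = conn.execute(
        "SELECT id, sender, body FROM messages WHERE id > ? ORDER BY id",
        (last_id,),
    )
    return cur.fetchall()


def poll_once(conn, last_id, actions):
    """Poll once, enqueue one reply action per new message, return the new cursor."""
    for msg_id, sender, body in fetch_new_messages(conn, last_id):
        actions.put(("send_text", sender, f"echo: {body}"))
        last_id = msg_id
    return last_id


def run_executor(actions, performed):
    """Single worker that handles actions strictly one at a time, in order."""
    while True:
        action = actions.get()
        if action is None:  # sentinel value: stop the worker
            break
        # A real executor would simulate mouse and keyboard input here.
        performed.append(action)


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE messages (id INTEGER PRIMARY KEY, sender TEXT, body TEXT)"
    )
    conn.executemany(
        "INSERT INTO messages (sender, body) VALUES (?, ?)",
        [("alice", "hi"), ("bob", "ping")],
    )
    actions = queue.Queue()
    performed = []
    worker = threading.Thread(target=run_executor, args=(actions, performed))
    worker.start()
    last_id = poll_once(conn, 0, actions)
    last_id = poll_once(conn, last_id, actions)  # no new rows, nothing enqueued
    actions.put(None)
    worker.join()
    return last_id, performed
```

Calling `demo()` runs two polling passes against the sample rows and drains the queue through the single worker. Keeping one consumer thread mirrors the linear nature of RPA execution: only one simulated input sequence can own the mouse and keyboard at a time.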
Quick Start & Requirements
Install: pip install omni-bot-sdk (requires Python 3.12).
Prerequisites: A tool (DbkeyHookCMD.exe or DbkeyHookUI.exe) is needed to obtain the WeChat database key. An MQTT service (e.g., nanomq) is required for message forwarding.
Configuration: config.yaml.
Users are advised not to operate the mouse or keyboard during RPA execution.
Highlighted Details
Maintenance & Community
The project is actively maintained, with a roadmap indicating future improvements like enhanced image parsing and robustness. Community interaction is encouraged via developer discussion groups.
Licensing & Compatibility
Licensed under GPLv3. As a copyleft license, it requires derivative works to be distributed under GPLv3 as well, which can complicate commercial or closed-source integration.
Limitations & Caveats
The RPA approach relies on visual recognition, which is not 100% accurate and can be thrown off by UI changes or by windows with similar titles. RPA operations run linearly and can conflict with user input. The database polling mechanism might be detectable by security software. Sending a message requires identifying the contact precisely, and fails when multiple contacts share the same name.
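The duplicate-name caveat can be illustrated with a small resolver that refuses to guess. This is a hedged sketch rather than the framework's own logic: `resolve_contact`, `AmbiguousContactError`, and the `id`/`name` fields are hypothetical names chosen for the example.

```python
class AmbiguousContactError(LookupError):
    """Raised when a display name matches more than one contact."""


def resolve_contact(contacts, display_name):
    """Return the unique contact id for display_name, or raise."""
    matches = [c["id"] for c in contacts if c["name"] == display_name]
    if not matches:
        raise LookupError(f"no contact named {display_name!r}")
    if len(matches) > 1:
        raise AmbiguousContactError(
            f"{len(matches)} contacts share the name {display_name!r}: {matches}"
        )
    return matches[0]


# Hypothetical contact list; the field names are illustrative only.
CONTACTS = [
    {"id": "c1", "name": "Alice"},
    {"id": "c2", "name": "Bob"},
    {"id": "c3", "name": "Bob"},
]
```

With this list, resolving "Alice" yields a single id, while resolving "Bob" raises `AmbiguousContactError` instead of sending to an arbitrary match. Failing loudly is a deliberate choice here, since a misdirected message cannot be recalled by the bot.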
1 month ago
Inactive