omni-bot-sdk-oss by weixin-omni

WeChat RPA framework with visual recognition

Created 6 months ago

420 stars

Top 69.9% on SourcePulse

Project Summary

This project provides a WeChat RPA framework for building chat robots that integrate with LLM services such as OpenAI and Dify. It targets developers and researchers who want to automate WeChat interactions without direct API access, and it offers a zero-intrusion approach based on visual recognition.

How It Works

The framework uses custom YOLO models and OCR to locate and operate UI elements in the WeChat client. This "zero-intrusion" method avoids hooking into or reverse engineering WeChat's internal APIs at runtime, with the aim of reducing detection risk. Incoming messages are captured by polling the local WeChat database with minimal latency, and outgoing actions are placed on a queue for an RPA executor that simulates user input.
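
The poll-then-queue flow described above can be sketched as follows. This is a minimal illustration, not the SDK's API: the `messages` table, its columns, and the function names `poll_new_messages`, `run_once`, and `drain` are all hypothetical, and a plain in-memory SQLite database stands in for the encrypted WeChat database.

```python
import queue
import sqlite3


def poll_new_messages(conn, last_id):
    """Return (id, sender, body) rows newer than last_id, oldest first."""
    cur = conn.execute(
        "SELECT id, sender, body FROM messages WHERE id > ? ORDER BY id",
        (last_id,),
    )
    return cur.fetchall()


def run_once(conn, actions, last_id):
    """Poll once, enqueue one reply action per new message, return the new cursor."""
    for msg_id, sender, body in poll_new_messages(conn, last_id):
        # Hypothetical action tuple: (action name, target contact, payload).
        actions.put(("send_text", sender, f"echo: {body}"))
        last_id = msg_id
    return last_id


def drain(actions):
    """Stand-in for the RPA executor: take actions strictly one at a time."""
    done = []
    while not actions.empty():
        done.append(actions.get())
    return done


# Demo with a hypothetical schema; the real database layout is not shown in the source.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE messages (id INTEGER PRIMARY KEY, sender TEXT, body TEXT)")
conn.executemany(
    "INSERT INTO messages (sender, body) VALUES (?, ?)",
    [("alice", "hi"), ("bob", "hello")],
)

actions = queue.Queue()
cursor = run_once(conn, actions, last_id=0)
executed = drain(actions)
```

Keeping a single FIFO queue between the poller and the executor mirrors the linear nature of RPA: only one simulated input sequence runs at a time, so actions never compete for the mouse and keyboard.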

Quick Start & Requirements

Installation: pip install omni-bot-sdk (requires Python 3.12).
Prerequisites: A separate tool (DbkeyHookCMD.exe or DbkeyHookUI.exe) is needed to obtain the WeChat database key. An MQTT service (e.g., nanomq) is required for message forwarding.
Setup: Requires obtaining a database key and configuring config.yaml. Users are advised not to operate the mouse/keyboard during RPA execution.
Documentation: Plugin documentation and examples are available in linked repositories.

Highlighted Details

Near-real-time message reception via database listening.
Zero-intrusion at runtime using YOLO and OCR, which reduces detection risk.
Plugin architecture for extensibility, supporting custom actions like sending Moments and Mini Programs.
Theoretically supports the latest WeChat versions due to its visual recognition approach.

Maintenance & Community

The project is actively maintained, with a roadmap indicating future improvements like enhanced image parsing and robustness. Community interaction is encouraged via developer discussion groups.

Licensing & Compatibility

Licensed under GPLv3. Its copyleft terms require derivative works to be released under GPLv3 as well, which can affect commercial or closed-source integration.

Limitations & Caveats

The RPA approach relies on visual recognition, which is not 100% accurate and can be thrown off by UI changes or by windows with similar titles. RPA operations run one at a time and can conflict with user input, so the mouse and keyboard should be left alone while they execute. The database polling mechanism might be detectable by security software. Message sending requires precise contact identification and fails when multiple contacts share the same name.
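
The duplicate-name caveat can be guarded against before an action is queued. The sketch below is illustrative only: the contact dictionaries, the `display_name` and `id` fields, and the `resolve_contact` helper are assumptions, not part of the SDK.

```python
class AmbiguousContactError(LookupError):
    """Raised when a display name matches more than one contact."""


def resolve_contact(name, contacts):
    """Return the id of the single contact named `name`, or raise."""
    matches = [c for c in contacts if c["display_name"] == name]
    if not matches:
        raise KeyError(name)
    if len(matches) > 1:
        # Refuse to guess: sending to the wrong chat is worse than not sending.
        raise AmbiguousContactError(f"{len(matches)} contacts are named {name!r}")
    return matches[0]["id"]


# Hypothetical contact list; the ids are made up for the example.
contacts = [
    {"id": "wxid_a", "display_name": "Alice"},
    {"id": "wxid_b1", "display_name": "Bob"},
    {"id": "wxid_b2", "display_name": "Bob"},
]
alice_id = resolve_contact("Alice", contacts)
```

Failing early on an ambiguous name lets the caller ask for a more specific identifier instead of letting the RPA executor click on whichever match appears first.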

