cyber-doctor by Warma10032

Multi-modal AI agent for personalized health assistance

Created 1 year ago

346 stars

Top 80.5% on SourcePulse

Project Summary

"Cyber Huatuo" (Warma10032/cyber-doctor) is a multi-modal intelligent agent designed to democratize healthcare access by providing a personal doctor assistant. It leverages large language models (LLMs) and knowledge graphs to offer functionalities like preliminary disease diagnosis, medical record analysis, and professional health Q&A, aiming to bridge geographical disparities in medical resources. The project targets individuals concerned about their health and can be adapted for domain-specific expertise beyond healthcare.

How It Works

The project integrates multiple AI models, orchestrated by an AI agent, to handle complex tasks. It features a core LLM backbone enhanced with Retrieval Augmented Generation (RAG) for knowledge base and internet retrieval, and a Neo4j knowledge graph for structured domain knowledge. A dedicated voice module supports speech-to-text (STT) and text-to-speech (TTS) for an accessible conversational interface. Multi-modal capabilities include image recognition for documents and generation of images and videos.
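The orchestration described above can be pictured as a small dispatcher that inspects which modalities a request carries and routes it to the matching pipeline. The following is a minimal sketch under stated assumptions, not the project's actual code: `Request`, `classify`, `dispatch`, and the `answer_*` handlers are hypothetical names, and the model calls are replaced by string stand-ins so the control flow is visible.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass
class Request:
    """One user turn. All field names are hypothetical."""
    text: str = ""
    audio_path: Optional[str] = None
    image_path: Optional[str] = None


def classify(request: Request) -> str:
    """Choose a pipeline name from the modalities present in the request."""
    if request.audio_path:
        return "voice"
    if request.image_path:
        return "image"
    return "text"


def answer_text(request: Request) -> str:
    # Stand-in for: retrieve context (RAG, knowledge graph), then call the LLM.
    return f"llm({request.text})"


def answer_voice(request: Request) -> str:
    # Stand-in for: speech-to-text, the text pipeline, then text-to-speech.
    transcript = f"stt({request.audio_path})"
    return f"tts({answer_text(Request(text=transcript))})"


def answer_image(request: Request) -> str:
    # Stand-in for: image recognition, then the text pipeline.
    extracted = f"ocr({request.image_path})"
    return answer_text(Request(text=f"{request.text} {extracted}".strip()))


HANDLERS: Dict[str, Callable[[Request], str]] = {
    "text": answer_text,
    "voice": answer_voice,
    "image": answer_image,
}


def dispatch(request: Request) -> str:
    """Route the request to the handler selected by classify()."""
    return HANDLERS[classify(request)](request)
```

In this sketch, voice and image requests both funnel into the same text pipeline, which is one plausible way to keep the LLM backbone as the single point where retrieved knowledge is applied.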

Quick Start & Requirements

Installation: Clone the repository (git clone).
Prerequisites: Python 3.10 or newer (3.10 recommended); a Conda environment is recommended.
Dependencies: pip install -r requirements.txt.
Configuration: Copy .env.example to .env and fill in API keys for supported LLMs (OpenAI-compatible, ZhipuAI, Ollama, etc.). Configure config/config-web.yaml.
Execution: Run python app.py. Access the UI at http://localhost:7860.
Optional: A Neo4j database enables the knowledge graph functionality; this requires downloading and importing a Neo4j dump file.
Links: Project Demo Video: https://www.bilibili.com/video/BV1CU2aYpEn2
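Before running python app.py, it can help to confirm that the setup steps above were completed. The following is a minimal preflight sketch using only the Python standard library. It is not part of the project: `parse_env` and `preflight` are hypothetical helpers, and the key name `LLM_API_KEY` in the usage note is an assumed placeholder, since the real variable names are defined in the project's .env.example.

```python
import sys
from pathlib import Path
from typing import Dict, List


def parse_env(text: str) -> Dict[str, str]:
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def preflight(project_dir: str, required_keys: List[str]) -> List[str]:
    """Return a list of setup problems; an empty list means all checks passed."""
    problems: List[str] = []
    if sys.version_info < (3, 10):
        problems.append("Python 3.10 or newer is required")

    root = Path(project_dir)
    env_file = root / ".env"
    if not env_file.is_file():
        problems.append(".env is missing; copy .env.example to .env")
    else:
        env = parse_env(env_file.read_text(encoding="utf-8"))
        for key in required_keys:
            if not env.get(key):
                problems.append(f"{key} is empty or missing in .env")

    if not (root / "config" / "config-web.yaml").is_file():
        problems.append("config/config-web.yaml is missing")
    return problems
```

A call such as preflight(".", ["LLM_API_KEY"]) from the repository root returns the outstanding problems as plain strings; replace the key list with the names that actually appear in .env.example.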

Highlighted Details

Multi-modal Integration: Supports text, voice, and image inputs, with capabilities for image recognition (e.g., medical records), video generation, and document generation (PPT/Word).
Enhanced Knowledge Access: Leverages RAG with custom knowledge bases (files), internet search (web crawler), and Neo4j knowledge graphs for context-aware and up-to-date responses.
Voice Interaction: Features a dedicated voice dialogue module with STT (Whisper) and TTS (edge-tts) supporting multiple dialects, enabling usage via voice commands.
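The knowledge access described above combines passages from several sources before the LLM answers. The following toy sketch illustrates that idea only. It swaps in plain word-overlap scoring for whatever retrieval the project actually uses (its embedding model, crawler, and Neo4j queries are not shown in this summary), and `retrieve`, `build_prompt`, and the source labels are hypothetical.

```python
from typing import Dict, List, Set, Tuple


def tokenize(text: str) -> Set[str]:
    """Lowercase and split on whitespace; a deliberately naive tokenizer."""
    return set(text.lower().split())


def score(query: str, passage: str) -> float:
    """Fraction of query words that also appear in the passage."""
    query_words = tokenize(query)
    if not query_words:
        return 0.0
    return len(query_words & tokenize(passage)) / len(query_words)


def retrieve(query: str, sources: Dict[str, List[str]], top_k: int = 2) -> List[Tuple[str, str]]:
    """Rank passages from all sources together and keep the best top_k."""
    scored = [
        (score(query, passage), name, passage)
        for name, passages in sources.items()
        for passage in passages
    ]
    # Drop passages with no overlap, then sort by score, highest first.
    scored = [item for item in scored if item[0] > 0]
    scored.sort(key=lambda item: item[0], reverse=True)
    return [(name, passage) for _, name, passage in scored[:top_k]]


def build_prompt(query: str, hits: List[Tuple[str, str]]) -> str:
    """Prepend the retrieved passages, tagged by source, to the question."""
    context = "\n".join(f"[{name}] {passage}" for name, passage in hits)
    return f"Context:\n{context}\n\nQuestion: {query}"
```

Tagging each passage with its source (files, web, or graph in this sketch) lets the model, and the reader of its answer, see where a claim came from.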

Maintenance & Community

The project lists several team members and acknowledges reference projects. Neither community channels (such as Discord or Slack) nor a public roadmap is detailed in the README. Contributions via issues and PRs are encouraged for API adaptation and feature improvements.

Licensing & Compatibility

The repository includes a LICENSE file, but its specific terms are not detailed in the provided README content. Compatibility for commercial use or linking with closed-source projects would depend on the exact license terms.

Limitations & Caveats

The setup for the user-specific knowledge base management UI and backend is not well documented. Although the project integrates many features, it states that individual components leave room for optimization, such as more sophisticated entity and relation processing for the knowledge graph.

