DeepDiagram by twwch

Agentic AI for instant diagram generation

Created 2 months ago

1,098 stars

Top 34.5% on SourcePulse

Project Summary

DeepDiagram AI is an open-source platform that turns natural language prompts and image inputs into professional diagrams, including mind maps, Mermaid diagrams, ECharts charts, Draw.io files, and infographics. It targets users who need rapid, structured visualization of ideas and data, and it takes an agent-driven approach: the user describes the result in a chat interface and the platform generates the diagram.

How It Works

DeepDiagram AI uses a multi-agent architecture in which specialized AI agents each handle one visualization domain. A router built on a ReAct-based orchestration layer (LangGraph) analyzes user intent and dispatches each request to the appropriate agent, so outputs ranging from interactive mind maps to data charts and technical drawings are produced by domain-specific logic. The system accepts multimodal input, including image uploads for digitization, and uses Server-Sent Events (SSE) to stream live preview updates from the FastAPI backend to the React frontend.
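The route-then-stream flow described above can be illustrated with a minimal sketch. It is not the project's implementation: the real router is LLM-driven via LangGraph, whereas this stand-in uses a hypothetical keyword table, and the agent names, event names, and the stream_diagram helper are all assumptions made for illustration. Only the SSE frame layout (an event line, a data line, then a blank line) follows the standard wire format.

```python
import json
from typing import Dict, Iterator, Tuple

# Hypothetical keyword table. The actual router relies on an LLM to infer
# intent; keywords are used here only to keep the sketch self-contained.
AGENT_KEYWORDS: Dict[str, Tuple[str, ...]] = {
    "mindmap": ("mind map", "brainstorm"),
    "chart": ("bar chart", "line chart", "pie chart"),
    "flowchart": ("flowchart", "workflow"),
    "mermaid": ("sequence diagram", "mermaid"),
}


def route(prompt: str, default: str = "mermaid") -> str:
    """Return the first agent whose keywords appear in the prompt."""
    text = prompt.lower()
    for agent, keywords in AGENT_KEYWORDS.items():
        if any(keyword in text for keyword in keywords):
            return agent
    return default  # arbitrary fallback chosen for this sketch


def sse_event(event: str, data: dict) -> str:
    """Format one SSE frame: event line, data line, terminating blank line."""
    return f"event: {event}\ndata: {json.dumps(data)}\n\n"


def stream_diagram(prompt: str) -> Iterator[str]:
    """Yield SSE frames: the chosen agent, output chunks, then a done marker."""
    yield sse_event("agent_selected", {"agent": route(prompt)})
    # Stand-in for incremental agent output; a real agent would yield
    # partial diagram source as the model produces it.
    for chunk in ("graph TD", "A --> B"):
        yield sse_event("chunk", {"text": chunk})
    yield sse_event("done", {})


for frame in stream_diagram("Mind map of Q3 goals"):
    print(frame, end="")
```

In a FastAPI backend, a generator of this shape could be wrapped in a StreamingResponse with media type text/event-stream, which is one common way to serve SSE; whether DeepDiagram does exactly that is not stated in the summary.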

Quick Start & Requirements

Prerequisites: Python 3.10+, Node.js v20+, Docker & Docker Compose (recommended).
Setup:
- Development: Clone the repository, change into the backend directory, run uv sync, then run start_backend.sh. Next, change into the frontend directory, run npm install, then npm run dev. Access via http://localhost:5173.
- Docker: Create a .env file in the root directory with necessary API keys (e.g., OPENAI_API_KEY, DEEPSEEK_API_KEY). Run docker-compose up -d. Access via http://localhost.
Demo: Available at http://121.4.104.214:81/.
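Before running docker-compose, it can help to confirm that the .env file actually defines the API keys it is meant to. The following is a small sketch under stated assumptions: the parser handles only simple KEY=VALUE lines, the helper names are hypothetical, and which keys the project truly requires is not stated in this summary (OPENAI_API_KEY and DEEPSEEK_API_KEY are the two examples it gives).

```python
from typing import Dict, Iterable, List


def parse_env(text: str) -> Dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(env: Dict[str, str], candidates: Iterable[str]) -> List[str]:
    """Return the candidate keys that are absent or set to an empty value."""
    return [key for key in candidates if not env.get(key)]


# Sample content standing in for a real .env file; the value is a placeholder.
sample = 'OPENAI_API_KEY="sk-placeholder"\n# optional provider\nDEEPSEEK_API_KEY=\n'
env = parse_env(sample)
print(missing_keys(env, ["OPENAI_API_KEY", "DEEPSEEK_API_KEY"]))
```

To check a real file, the sample string can be replaced with the contents of .env read from the repository root.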

Highlighted Details

Specialized Agents: Supports Mind Map (mind-elixir), Flowchart (React Flow), Data Chart (Apache ECharts), Draw.io, Mermaid.js, and Infographic (AntV Infographic) generation.
Multimodal Input: Accepts image uploads (e.g., sketches, whiteboard photos) for digitization and context.
Session Management: Features persistent history, branching, and versioning for exploring different visualization paths and restoring states.
UI Enhancements: Includes a modern chat input, stable/resizable panels, and contextual process trace actions for debugging and refinement.
Tech Stack: React 19 frontend with Vite, FastAPI backend orchestrated by LangGraph, and PostgreSQL for storage.
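The session management feature listed above (history, branching, and restoring earlier states) maps naturally onto a tree of versions in which each version records its parent. The sketch below is an in-memory illustration of that idea only; the class and method names are hypothetical, and the project's actual PostgreSQL schema is not described in this summary.

```python
from dataclasses import dataclass
from itertools import count
from typing import Dict, List, Optional


@dataclass
class Version:
    id: int
    parent: Optional[int]  # None for the first version in a session
    content: str           # e.g. diagram source produced by an agent


class Session:
    """In-memory version tree; a real system would persist these rows."""

    def __init__(self) -> None:
        self._ids = count(1)
        self.versions: Dict[int, Version] = {}
        self.head: Optional[int] = None

    def commit(self, content: str) -> int:
        """Store a new version whose parent is the current head."""
        vid = next(self._ids)
        self.versions[vid] = Version(vid, self.head, content)
        self.head = vid
        return vid

    def restore(self, vid: int) -> str:
        """Move head to an earlier version; the next commit branches from it."""
        version = self.versions[vid]  # raises KeyError for unknown ids
        self.head = vid
        return version.content

    def lineage(self, vid: Optional[int] = None) -> List[int]:
        """Return version ids from the root down to vid (default: head)."""
        path: List[int] = []
        current = self.head if vid is None else vid
        while current is not None:
            path.append(current)
            current = self.versions[current].parent
        return path[::-1]


session = Session()
first = session.commit("graph TD; A-->B")
second = session.commit("graph TD; A-->B-->C")
session.restore(first)                       # go back to the first state
branch = session.commit("graph TD; A-->D")   # creates a sibling of `second`
print(session.lineage(), session.lineage(second))
```

Storing only a parent pointer per version keeps branching cheap: restoring never deletes anything, and each branch is recovered by walking parent links from its tip.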

Maintenance & Community

The project roadmap marks much of the planned work as complete, including the core agents, session management, and UI improvements. Community channels (e.g., Discord, Slack) and active contributor details are not listed in the provided README.

Licensing & Compatibility

This project is licensed under the GNU Affero General Public License v3.0 (AGPL-3.0). This strong copyleft license requires that derivative works be released under the AGPL-3.0 as well, and that users who interact with a modified version over a network be offered its source code. The license does not prohibit commercial use, but these terms make integration into closed-source applications difficult.

Limitations & Caveats

Multimodal support is currently limited mainly to image uploads; parsing formats such as PDF or Docx for context is listed as future work. The AGPL-3.0 copyleft terms also warrant review before commercial adoption or integration into proprietary software.
