DeepDiagram  by twwch

Agentic AI for instant diagram generation

Created 1 month ago
321 stars

Top 84.7% on SourcePulse

Project Summary

DeepDiagram AI is an open-source platform that turns natural language prompts and image inputs into professional diagrams, including mind maps, Mermaid diagrams, ECharts charts, Draw.io files, and infographics. It targets users who need rapid, structured visualization of ideas and data, and it takes an agent-driven approach to diagram generation so that complex visual outputs can be produced through a chat-style interaction.

How It Works

DeepDiagram AI uses a multi-agent architecture in which each specialized AI agent handles one visualization domain. A router, built on a ReAct-based orchestration layer (LangGraph), analyzes user intent and sends each request to the appropriate agent. This gives each output type, from interactive mind maps to data charts and technical drawings, its own domain-specific generator. The system accepts multimodal input, including image uploads for digitization, and uses Server-Sent Events (SSE) to stream live preview updates from the FastAPI backend to the React frontend.
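
The route-then-stream flow described above can be sketched in a few lines of Python. This is a minimal illustration, not the project's implementation: the agent names, the keyword table, and the event names (agent_selected, chunk, done) are all hypothetical, and a keyword match stands in for the LLM-driven router that LangGraph orchestrates in the real system. Only the SSE framing (event and data fields followed by a blank line) follows the actual wire format.

```python
import json
from typing import Dict, Iterator, List, Tuple

# Hypothetical agent names and trigger keywords, for illustration only.
AGENT_KEYWORDS: Dict[str, Tuple[str, ...]] = {
    "mindmap": ("mind map", "brainstorm"),
    "charts": ("chart", "plot"),
    "mermaid": ("sequence diagram", "mermaid", "gantt"),
    "drawio": ("draw.io", "architecture", "network topology"),
}
DEFAULT_AGENT = "mermaid"  # assumed fallback, not taken from the project


def route(prompt: str) -> str:
    """Pick an agent by keyword match (a stand-in for an LLM router)."""
    text = prompt.lower()
    for agent, keywords in AGENT_KEYWORDS.items():
        if any(keyword in text for keyword in keywords):
            return agent
    return DEFAULT_AGENT


def sse_event(event: str, data: dict) -> str:
    """Frame one Server-Sent Event; JSON keeps the payload on a single line."""
    return f"event: {event}\ndata: {json.dumps(data)}\n\n"


def stream_diagram(prompt: str, chunks: List[str]) -> Iterator[str]:
    """Yield the SSE frames a backend might send for one request."""
    yield sse_event("agent_selected", {"agent": route(prompt)})
    for chunk in chunks:
        yield sse_event("chunk", {"text": chunk})
    yield sse_event("done", {})


if __name__ == "__main__":
    for frame in stream_diagram("Bar chart of revenue", ["{", "}"]):
        print(frame, end="")
```

In a FastAPI backend, a generator like stream_diagram would typically be wrapped in a streaming response with the text/event-stream media type, so the frontend can render partial output as each frame arrives.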

Quick Start & Requirements

  • Prerequisites: Python 3.10+, Node.js v20+, Docker & Docker Compose (recommended).
  • Setup:
    • Development: Clone the repository, navigate to backend, run uv sync, then run start_backend.sh. Next, navigate to frontend and run npm install followed by npm run dev. Access via http://localhost:5173.
    • Docker: Create a .env file in the root directory with necessary API keys (e.g., OPENAI_API_KEY, DEEPSEEK_API_KEY). Run docker-compose up -d. Access via http://localhost.
  • Demo: Available at http://121.4.104.214:81/.
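
Before starting the Docker setup, it can help to confirm that the .env file actually defines the keys you intend to use. The sketch below is a simplified checker under stated assumptions: it handles only plain KEY=VALUE lines and comments (Docker Compose's own parser supports more syntax), and the helper names parse_env and missing_keys are hypothetical. Which keys are truly required depends on the model provider you configure; the two named here are the examples given above.

```python
from typing import Dict, Iterable, List


def parse_env(text: str) -> Dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blanks and '#' comments."""
    values: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(text: str, required: Iterable[str]) -> List[str]:
    """Return the required keys that are absent or empty."""
    env = parse_env(text)
    return [key for key in required if not env.get(key)]


if __name__ == "__main__":
    sample = "OPENAI_API_KEY=sk-example\n# DeepSeek key left blank\nDEEPSEEK_API_KEY=\n"
    print(missing_keys(sample, ["OPENAI_API_KEY", "DEEPSEEK_API_KEY"]))
```

To check a real file, read it with pathlib.Path(".env").read_text() and pass the result to missing_keys.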

Highlighted Details

  • Specialized Agents: Supports Mind Map (mind-elixir), Flowchart (React Flow), Data Chart (Apache ECharts), Draw.io, Mermaid.js, and Infographic (AntV Infographic) generation.
  • Multimodal Input: Accepts image uploads (e.g., sketches, whiteboard photos) for digitization and context.
  • Session Management: Features persistent history, branching, and versioning for exploring different visualization paths and restoring states.
  • UI Enhancements: Includes a modern chat input, stable/resizable panels, and contextual process trace actions for debugging and refinement.
  • Tech Stack: React 19 frontend with Vite, FastAPI backend orchestrated by LangGraph, and PostgreSQL for storage.
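
The session features listed above (persistent history, branching, and restoring earlier states) map naturally onto a tree of versions with parent pointers. The following is a minimal in-memory sketch of that idea, not the project's schema: the Version and Session classes and their fields are hypothetical, and the real system persists its data in PostgreSQL.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Version:
    id: int
    parent: Optional[int]  # None for the first version in a session
    content: str           # e.g. generated Mermaid or Draw.io source


@dataclass
class Session:
    versions: Dict[int, Version] = field(default_factory=dict)
    head: Optional[int] = None  # the version currently shown to the user

    def commit(self, content: str) -> int:
        """Store a new version as a child of the current head."""
        version_id = len(self.versions) + 1
        self.versions[version_id] = Version(version_id, self.head, content)
        self.head = version_id
        return version_id

    def restore(self, version_id: int) -> str:
        """Move head to an earlier version; the next commit starts a branch."""
        if version_id not in self.versions:
            raise KeyError(f"unknown version {version_id}")
        self.head = version_id
        return self.versions[version_id].content

    def history(self) -> List[int]:
        """Return version ids from the root to the current head."""
        path: List[int] = []
        current = self.head
        while current is not None:
            path.append(current)
            current = self.versions[current].parent
        return path[::-1]


if __name__ == "__main__":
    session = Session()
    first = session.commit("graph TD; A-->B")
    session.commit("graph TD; A-->B-->C")
    session.restore(first)
    session.commit("graph TD; A-->D")
    print(session.history())
```

Because restore only moves the head and never deletes anything, every earlier version stays reachable, which is what makes exploring alternative visualization paths cheap.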

Maintenance & Community

The project roadmap marks the core agents, session management, and UI improvements as complete. The README does not list community channels (e.g., Discord, Slack) or details about active contributors.

Licensing & Compatibility

This project is licensed under the GNU Affero General Public License v3.0 (AGPL-3.0). This strong copyleft license requires derivative works to be released under the AGPL-3.0 as well, and it extends that obligation to network use: if you run a modified version as a service, you must offer its source code to the users who interact with it. This can restrict integration into closed-source applications and some commercial products.

Limitations & Caveats

Multimodal support is currently limited mainly to image uploads; parsing PDF or Docx files for context is listed as future work. The AGPL-3.0 copyleft terms also call for careful review before commercial adoption or integration into proprietary software.

Health Check

  • Last Commit: 19 hours ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 5

Star History

323 stars in the last 30 days
