Edit-Banana  by BIT-DataLab

Static diagram to editable DrawIO XML converter

Created 4 weeks ago

1,464 stars

Top 27.8% on SourcePulse


Summary

IMG2XML converts static diagrams (flowcharts, architecture diagrams, schematics) into editable DrawIO (mxGraph) XML files. It targets users who need to repurpose existing diagrams quickly, and offers high-fidelity, one-click conversion that preserves details, logic, and styles so the result can be edited further.

How It Works

The pipeline segments diagram elements with SAM 3, then runs iterative VLM scanning guided by multimodal LLMs (Qwen-VL/GPT-4V) so that elements are captured comprehensively. OCR (Azure, with Mistral Vision as a fallback) extracts text and converts formulas to LaTeX. RMBG-2.0 removes backgrounds from icons and arrows. Standard shapes are recovered as native DrawIO vector shapes, while arrows are kept as transparent images to preserve visual fidelity.
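To make the output format concrete, here is a minimal sketch of how detected elements could be written as mxGraph XML, with a standard shape emitted as a native vector cell and an arrow emitted as an image cell. It is not the project's code: the element dictionaries, the build_drawio_xml name, and the file name arrow_01.png are hypothetical, and the style strings are common DrawIO style keys chosen for illustration.

```python
import xml.etree.ElementTree as ET


def build_drawio_xml(elements):
    """Build an mxGraphModel string from a list of (hypothetical) element dicts."""
    model = ET.Element("mxGraphModel")
    root = ET.SubElement(model, "root")
    # DrawIO files start with two structural cells: the root (id 0)
    # and a default layer (id 1) that holds the visible cells.
    ET.SubElement(root, "mxCell", {"id": "0"})
    ET.SubElement(root, "mxCell", {"id": "1", "parent": "0"})

    for index, el in enumerate(elements, start=2):
        if el["kind"] == "shape":
            # Native vector shape: stays editable (fill, stroke, label).
            style = (
                f"{el['shape']};whiteSpace=wrap;html=1;"
                f"fillColor={el['fill']};strokeColor={el['stroke']};"
            )
        else:
            # Raster element (for example an arrow crop) placed as an image cell.
            style = f"shape=image;imageAspect=0;image={el['image']};"

        cell = ET.SubElement(root, "mxCell", {
            "id": str(index),
            "value": el.get("text", ""),
            "style": style,
            "vertex": "1",
            "parent": "1",
        })
        x, y, w, h = el["bbox"]
        ET.SubElement(cell, "mxGeometry", {
            "x": str(x), "y": str(y),
            "width": str(w), "height": str(h),
            "as": "geometry",
        })
    return ET.tostring(model, encoding="unicode")


# Hypothetical detections: one rounded box with OCR text, one arrow crop.
sample_elements = [
    {"kind": "shape", "shape": "rounded=1", "fill": "#dae8fc",
     "stroke": "#6c8ebf", "text": "Encoder", "bbox": (40, 40, 120, 60)},
    {"kind": "image", "image": "arrow_01.png", "bbox": (170, 60, 80, 20)},
]

if __name__ == "__main__":
    print(build_drawio_xml(sample_elements))
```

Under these assumptions, each cell carries its own geometry and style, so the shape can be selected and restyled independently in DrawIO while the arrow stays a positioned image.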

Quick Start & Requirements

  • Prerequisites: Python 3.10+, Node.js & npm, CUDA GPU (recommended).
  • Installation: Clone the repo, then run pip install -r requirements.txt and cd frontend && npm install. Manually create the input/, output/, and sam3_output/ directories. Download the RMBG-2.0 and SAM 3 model weights and place them at models/rmbg/model.onnx and models/sam3.pt, respectively.
  • Configuration: Copy config.yaml.example to config.yaml and set environment variables (e.g., Azure credentials).
  • Usage:
    • Web: python server.py (backend), cd frontend && npm run dev (frontend). Access at http://localhost:5173.
    • CLI: python scripts/run_all.py --image input/test_diagram.png.
  • Demo: https://db121-img2xml.cn/.
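The manual setup steps above can be checked with a small preflight script. This is a sketch rather than something shipped with the project: the preflight function and its constants are hypothetical, and only the directory and file paths come from the instructions above.

```python
from pathlib import Path

# Paths taken from the setup instructions; everything else here is illustrative.
REQUIRED_DIRS = ("input", "output", "sam3_output")
REQUIRED_FILES = ("models/rmbg/model.onnx", "models/sam3.pt", "config.yaml")


def preflight(repo_root):
    """Create the working directories and return the required files that are missing."""
    root = Path(repo_root)
    for name in REQUIRED_DIRS:
        # exist_ok makes the script safe to run repeatedly.
        (root / name).mkdir(parents=True, exist_ok=True)
    # Model weights and config must be supplied by the user, so only report them.
    return [rel for rel in REQUIRED_FILES if not (root / rel).is_file()]


if __name__ == "__main__":
    missing = preflight(".")
    if missing:
        print("Missing files:")
        for rel in missing:
            print(f"  {rel}")
    else:
        print("Setup looks complete.")
```

Run from the repository root before starting the server or the CLI. It creates the three directories if needed and lists any model weights or config file that still have to be put in place.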

Highlighted Details

  • High-Fidelity Restoration: Preserves layout, color, hierarchy, stroke/fill, and arrow styles (dashed, curved) with 1:1 accuracy.
  • Editable Elements: All recovered elements are independently selectable and modifiable within DrawIO.
  • Advanced AI Pipeline: SAM 3 segmentation and iterative VLM scanning with LLMs ensure robust recognition.
  • Accurate Text & Formula Handling: High-quality OCR and Mistral Vision/MLLM for text and LaTeX formula conversion.
  • Smart Background Removal: RMBG-2.0 for clean icon/arrow layering.

Maintenance & Community

The repository lists contributors and provides contribution guidelines. The roadmap includes "Intelligent Arrow Connection" (in development), plus "DrawIO Template Adaptation," "Batch Export Optimization," and "Local LLM Adaptation" (all three planned). Use Issues for bug reports and Discussions for suggestions.

Licensing & Compatibility

The Apache License 2.0 permits commercial use, modification, and distribution, provided the copyright and license notices are retained. It is compatible with use in closed-source projects.

Limitations & Caveats

"Intelligent Arrow Connection" is in development. Local VLM deployment support is planned.

Health Check

  • Last Commit: 4 days ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 2
  • Issues (30d): 14
  • Star History: 1,481 stars in the last 29 days
