SVGDreamer by ximinng

Research paper for text-guided SVG generation using diffusion

Created 2 years ago
437 stars

Top 68.0% on SourcePulse

Project Summary

SVGDreamer is the code release accompanying a CVPR 2024 paper on diffusion-based, text-guided SVG generation. It targets researchers and artists who want to synthesize high-quality vector graphics from text descriptions while keeping control over style and the ability to edit the result.

How It Works

SVGDreamer uses a diffusion model to guide the generation of SVG paths in two stages. The first, semantic-driven image vectorization (SIVE), produces and refines the initial shapes; the second, vectorized particle-based score distillation (VPSD), produces the final SVG output. The approach aims to balance synthesis quality with the editability of the resulting vector graphics.
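
To make "generating SVG paths" concrete, here is a minimal sketch of the kind of data such a method works with: shapes described by Bézier control points plus a fill color, serialized to SVG markup. It is an illustration only, not the project's code. The names `BezierPath` and `render_svg` are hypothetical, and the control-points-and-colors representation is an assumption that this summary doesn't spell out.

```python
from dataclasses import dataclass
from typing import List, Tuple

Point = Tuple[float, float]


@dataclass
class BezierPath:
    """One closed shape: a start point, cubic Bezier segments, and a fill color.

    Hypothetical container for illustration; not a class from SVGDreamer.
    """
    start: Point
    segments: List[Tuple[Point, Point, Point]]  # (control1, control2, end)
    fill_rgba: Tuple[float, float, float, float]  # each channel in [0, 1]

    def to_svg(self) -> str:
        # Build the path data string: move to start, add curves, close the shape.
        d = [f"M {self.start[0]:g} {self.start[1]:g}"]
        for c1, c2, end in self.segments:
            d.append(
                f"C {c1[0]:g} {c1[1]:g}, {c2[0]:g} {c2[1]:g}, {end[0]:g} {end[1]:g}"
            )
        d.append("Z")
        r, g, b, a = self.fill_rgba
        fill = f"rgb({round(r * 255)},{round(g * 255)},{round(b * 255)})"
        return f'<path d="{" ".join(d)}" fill="{fill}" fill-opacity="{a:g}"/>'


def render_svg(paths: List[BezierPath], size: int = 256) -> str:
    """Wrap the serialized paths in a square SVG canvas."""
    body = "\n".join("  " + p.to_svg() for p in paths)
    return (
        f'<svg xmlns="http://www.w3.org/2000/svg" '
        f'width="{size}" height="{size}" viewBox="0 0 {size} {size}">\n'
        f"{body}\n</svg>"
    )


# A single red, half-transparent shape made of one cubic segment.
shape = BezierPath(
    start=(10, 10),
    segments=[((20, 0), (30, 0), (40, 10))],
    fill_rgba=(1.0, 0.0, 0.0, 0.5),
)
svg_text = render_svg([shape])
```

Because every shape stays an explicit list of points and a color, the output can be edited after generation by changing those values, which is the editing potential the summary refers to.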

Quick Start & Requirements

  • Installation: Run bash script/install.sh or use the provided Docker script bash script/run_svgdreamer_docker.sh.
  • Prerequisites: Requires a pretrained Stable Diffusion model (e.g., Stable Diffusion 2.1 Base). The model can be auto-downloaded by setting diffuser.download=True in conf/config.yaml.
  • Resources: enable_xformers=True is recommended for faster optimization. state.mprec='fp16' can reduce GPU memory usage.
  • Documentation: Examples.md
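
The settings above can be collected in one place and turned into `key=value` overrides, as in the sketch below. It assumes the project accepts Hydra-style command-line overrides, which the dotted key names suggest but this summary doesn't confirm. The entry-point name `svgdreamer.py` and the helper `build_overrides` are hypothetical. The code only prints the command and doesn't run it.

```python
import shlex
from typing import Dict, List, Union

ConfigValue = Union[bool, int, float, str]

# Assumption: the name of the launch script is not given in this summary.
ENTRY_POINT = "svgdreamer.py"


def build_overrides(options: Dict[str, ConfigValue]) -> List[str]:
    """Turn a flat {dotted.key: value} mapping into key=value strings."""
    return [f"{key}={value}" for key, value in options.items()]


# Keys and values taken from the Quick Start notes above.
options: Dict[str, ConfigValue] = {
    "diffuser.download": True,   # auto-download the Stable Diffusion weights
    "enable_xformers": True,     # recommended for faster optimization
    "state.mprec": "fp16",       # can reduce GPU memory usage
}

command = ["python", ENTRY_POINT, *build_overrides(options)]
command_line = shlex.join(command)  # shell-safe string for copy and paste
print(command_line)
```

If command-line overrides turn out not to be supported, the same keys can be set directly in conf/config.yaml as described above.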

Highlighted Details

  • Supports multiple generation styles including iconography, painting, pixel art, low-poly, sketch, and ink/wash.
  • Offers control over generation through various parameters like skip_sive, token_ind, result_path, and x.vpsd.t_schedule.
  • Includes a newer version, SVGDreamer++, with enhanced visual representation and editing capabilities.

Maintenance & Community

The project is associated with the CVPR 2024 paper. Links to community channels or roadmaps are not explicitly provided in the README.

Licensing & Compatibility

  • License: MIT License.
  • Compatibility: Permissive license suitable for commercial use and integration with closed-source projects.

Limitations & Caveats

The README mentions a "TODO" list, indicating ongoing development. Specific limitations or known bugs are not detailed.

Health Check

  • Last commit: 11 months ago
  • Responsiveness: Inactive
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 7 stars in the last 30 days
