SVGDreamer by ximinng

Research paper for text-guided SVG generation using diffusion

Created 1 year ago
399 stars

Top 72.4% on SourcePulse

Project Summary

SVGDreamer is the implementation accompanying a CVPR 2024 paper on text-guided SVG generation with diffusion models. It targets researchers and artists who want to synthesize high-quality vector graphics from text descriptions, with control over style and support for editing the result.

How It Works

SVGDreamer uses a pretrained text-to-image diffusion model to guide the optimization of SVG paths. It works in two stages: first, Semantic-driven Image Vectorization (SIVE) produces an initial vector graphic whose foreground objects and background are separated so that individual elements remain editable; second, Vectorized Particle-based Score Distillation (VPSD) refines the paths into the final SVG output. The approach aims to balance synthesis quality with editability.
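To make "SVG paths" concrete, the sketch below shows one common way to represent a vector shape as plain numbers (Bézier control points plus a fill color) and serialize it to SVG markup. It is an illustration only, not code from the SVGDreamer repository: the VectorPath class, the helper names, and the 224-pixel canvas are all hypothetical choices made for this example.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass

Point = tuple[float, float]


@dataclass
class VectorPath:
    """Hypothetical container: one closed shape made of cubic Bezier segments."""
    start: Point
    # Each segment is (control point 1, control point 2, end point).
    segments: list[tuple[Point, Point, Point]]
    fill_rgba: tuple[float, float, float, float]  # channels in [0, 1]


def path_to_d(path: VectorPath) -> str:
    """Build the SVG path 'd' attribute from the control points."""
    parts = [f"M {path.start[0]:g} {path.start[1]:g}"]
    for c1, c2, end in path.segments:
        parts.append(
            f"C {c1[0]:g} {c1[1]:g}, {c2[0]:g} {c2[1]:g}, {end[0]:g} {end[1]:g}"
        )
    parts.append("Z")  # close the shape
    return " ".join(parts)


def render_svg(paths: list[VectorPath], size: int = 224) -> str:
    """Serialize a list of paths into a standalone SVG document string."""
    svg = ET.Element("svg", {
        "xmlns": "http://www.w3.org/2000/svg",
        "width": str(size),
        "height": str(size),
        "viewBox": f"0 0 {size} {size}",
    })
    for p in paths:
        r, g, b, a = p.fill_rgba
        ET.SubElement(svg, "path", {
            "d": path_to_d(p),
            "fill": f"rgb({round(r * 255)},{round(g * 255)},{round(b * 255)})",
            "fill-opacity": f"{a:g}",
        })
    return ET.tostring(svg, encoding="unicode")


# A single two-segment blob with made-up coordinates.
blob = VectorPath(
    start=(50, 50),
    segments=[((100, 0), (150, 0), (174, 50)),
              ((150, 200), (100, 200), (50, 50))],
    fill_rgba=(0.2, 0.4, 0.6, 0.8),
)
svg_text = render_svg([blob])
print(svg_text)
```

Because every shape is just a list of coordinates and color values, an optimizer can adjust those numbers while the output stays an ordinary, editable SVG file.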

Quick Start & Requirements

  • Installation: Run bash script/install.sh or use the provided Docker script bash script/run_svgdreamer_docker.sh.
  • Prerequisites: Requires a pretrained Stable Diffusion model (e.g., Stable Diffusion 2.1 Base). The model can be auto-downloaded by setting diffuser.download=True in conf/config.yaml.
  • Resources: enable_xformers=True is recommended for faster optimization. state.mprec='fp16' can reduce GPU memory usage.
  • Documentation: Examples.md

Highlighted Details

  • Supports multiple generation styles including iconography, painting, pixel art, low-poly, sketch, and ink/wash.
  • Offers control over generation through parameters such as skip_sive, token_ind, result_path, and x.vpsd.t_schedule.
  • Includes a newer version, SVGDreamer++, with enhanced visual representation and editing capabilities.
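The parameters listed above use a dotted key=value form. As a rough sketch, a nested settings dictionary could be flattened into such overrides and appended to a command line. The entry-script name svgdreamer.py, the assumption that parameters are passed as command-line arguments, and every value shown (4, "randint", the result path) are placeholders for illustration and are not confirmed by this summary; consult Examples.md for the actual invocations.

```python
import shlex

# Assumption: the name of the entry script is not stated in this summary.
ENTRY_SCRIPT = "svgdreamer.py"


def to_overrides(settings: dict, prefix: str = "") -> list[str]:
    """Flatten a nested dict into dotted key=value strings."""
    out = []
    for key, value in settings.items():
        dotted = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            out.extend(to_overrides(value, dotted))
        else:
            out.append(f"{dotted}={value}")
    return out


# Keys come from the list above; the values are illustrative placeholders.
settings = {
    "skip_sive": False,
    "token_ind": 4,
    "result_path": "./logs/example",
    "x": {"vpsd": {"t_schedule": "randint"}},
}

overrides = to_overrides(settings)
command = ["python", ENTRY_SCRIPT, *overrides]

# shlex.join quotes any argument that the shell would otherwise split.
print(shlex.join(command))
```

Keeping the settings in a dictionary makes it easy to vary one parameter at a time, for example sweeping result_path across runs, without hand-editing a long command string.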

Maintenance & Community

The project is associated with the CVPR 2024 paper. Links to community channels or roadmaps are not explicitly provided in the README.

Licensing & Compatibility

  • License: MIT License.
  • Compatibility: Permissive license suitable for commercial use and integration with closed-source projects.

Limitations & Caveats

The README mentions a "TODO" list, indicating ongoing development. Specific limitations or known bugs are not detailed.

Health Check

  • Last Commit: 4 months ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 1
  • Issues (30d): 0
  • Star History: 4 stars in the last 30 days
