DiffSynth-Studio by modelscope

Open-source project for diffusion model exploration

Created 1 year ago
10,045 stars

Top 5.1% on SourcePulse

Project Summary

This project provides a research-focused platform for exploring cutting-edge diffusion models, particularly for video synthesis and image generation. It targets academic researchers and developers seeking to integrate and experiment with a wide array of state-of-the-art diffusion models, offering novel inference capabilities and a flexible framework for innovation.

How It Works

DiffSynth Studio integrates numerous open-source diffusion models, including FLUX, Wan-Video, CogVideoX, and Stable Diffusion variants. It supports advanced techniques like ControlNet for fine-grained control, LoRA for efficient fine-tuning, and specialized pipelines for tasks such as video editing, stylization, and toon shading. The project emphasizes aggressive technological exploration, enabling users to combine different models and techniques for novel applications.
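
As an illustration of how these pieces combine, below is a minimal sketch of the ModelManager/pipeline pattern used throughout the project's example scripts. The model paths, the LoRA file, and the argument values are illustrative assumptions; check the documentation for the exact API.

```python
# Minimal sketch of the ModelManager / pipeline pattern (not the exact API;
# the model paths, LoRA file, and argument values below are assumptions).
import torch
from diffsynth import ModelManager, FluxImagePipeline

# Load previously downloaded model weights (hypothetical local paths).
model_manager = ModelManager(torch_dtype=torch.bfloat16, device="cuda")
model_manager.load_models([
    "models/FLUX/FLUX.1-dev/text_encoder/model.safetensors",  # hypothetical path
    "models/FLUX/FLUX.1-dev/text_encoder_2",                  # hypothetical path
    "models/FLUX/FLUX.1-dev/ae.safetensors",                  # hypothetical path
    "models/FLUX/FLUX.1-dev/flux1-dev.safetensors",           # hypothetical path
])

# Optionally blend in a LoRA for lightweight style adaptation (hypothetical file).
model_manager.load_lora("models/lora/my_style_lora.safetensors", lora_alpha=1.0)

# Build an image pipeline from the loaded models and run inference.
pipe = FluxImagePipeline.from_model_manager(model_manager)
image = pipe(prompt="a watercolor lighthouse at dusk, soft morning light", seed=0)
image.save("image.png")
```

The same pattern generalizes to the video pipelines: load the relevant checkpoints into a ModelManager, construct the matching pipeline class, and call it with task-specific arguments.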

Quick Start & Requirements

  • Install: from source with pip install -e . (recommended), or from PyPI with pip install diffsynth (the PyPI release may lag behind the repository).
  • Prerequisites: torch, sentencepiece, cmake, cupy.
  • Models: Download pre-set models via diffsynth.download_models (a short sketch follows this list) or obtain custom models from ModelScope/HuggingFace.
  • Docs: https://diffsynth-studio.readthedocs.io/zh-cn/latest/index.html
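
For the pre-set model download mentioned above, the call is roughly as follows; the model ID is an example, and the supported list lives in the docs.

```python
# Fetch a pre-set model before running a pipeline; the ID below is an example,
# and the default download location (a local models/ directory) is assumed.
from diffsynth import download_models

download_models(["FLUX.1-dev"])
```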

Highlighted Details

  • Supports a broad range of video synthesis models (e.g., Wan-Video, HunyuanVideo, CogVideoX, ExVideo) and image models (e.g., FLUX, Kolors, Stable Diffusion 3); a video-generation sketch follows this list.
  • Features advanced capabilities like ControlNet compatibility for complex image generation, extended video generation (up to 128 frames), and toon shading.
  • Includes research contributions like EliGen for entity-level control and ArtAug for aesthetic enhancement.
  • Offers both Python API and WebUI (Gradio/Streamlit) for usage.
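
To give a feel for the video side of the Python API, here is a hedged sketch of text-to-video generation. The pipeline class and helper names follow the repo's example scripts, but the checkpoint paths, frame count, and fps are assumptions to verify against the documentation.

```python
# Hedged sketch of text-to-video generation; the checkpoint paths, num_frames,
# and fps values are assumptions and should match the model actually loaded.
import torch
from diffsynth import ModelManager, WanVideoPipeline, save_video

model_manager = ModelManager(device="cuda")
model_manager.load_models([
    "models/Wan/Wan2.1-T2V-1.3B/diffusion_pytorch_model.safetensors",  # hypothetical path
    "models/Wan/Wan2.1-T2V-1.3B/models_t5_umt5-xxl-enc-bf16.pth",      # hypothetical path
    "models/Wan/Wan2.1-T2V-1.3B/Wan2.1_VAE.pth",                       # hypothetical path
])

pipe = WanVideoPipeline.from_model_manager(
    model_manager, torch_dtype=torch.bfloat16, device="cuda"
)

# Generate a short clip and write it to disk.
video = pipe(prompt="a paper boat drifting down a rainy street, cinematic",
             num_frames=81, seed=0)
save_video(video, "video.mp4", fps=15)
```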

Maintenance & Community

The project has transitioned to ModelScope and is actively maintained. It has released numerous updates and research papers, indicating ongoing development. Links to demos and model repositories are provided.

Licensing & Compatibility

The README does not explicitly state a license. Compatibility for commercial use or closed-source linking is not specified.

Limitations & Caveats

The project is primarily targeted at academic exploration and may require significant technical expertise to set up and utilize effectively. The PyPI installation may not always reflect the latest features.

Health Check

  • Last Commit: 16 hours ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 36
  • Issues (30d): 75

Star History

  • 459 stars in the last 30 days

Explore Similar Projects

Starred by Chip Huyen (author of "AI Engineering" and "Designing Machine Learning Systems"), Luca Antiga (CTO of Lightning AI), and 2 more.

mmagic by open-mmlab

AIGC toolbox for image/video editing and generation
Top 0.1% on SourcePulse · 7k stars
Created 6 years ago · Updated 1 year ago