disco-diffusion by alembics

AI art and animation notebook

created 3 years ago
7,452 stars

Top 7.1% on sourcepulse

Project Summary

Disco Diffusion is a notebook-based toolkit for generating AI art and animations with diffusion models, aimed at artists and researchers. It supports text-to-image and image-to-image generation, along with animation and 3D effects.

How It Works

The project builds on Katherine Crowson's diffusion models and OpenAI's CLIP: CLIP scores how well an intermediate image matches the text prompt, and that score guides the diffusion process toward the prompt. It also incorporates techniques and models from the community, such as advanced cutout methods, SLIP models (later removed; see Limitations & Caveats), and depth estimation for 3D animation. Several CLIP models can be used together so that the output follows the prompt more closely, and zooming, panning, rotation, and keyframing are available for animated sequences.
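
To make the idea of guidance from several CLIP models concrete, here is a minimal sketch in plain Python. It assumes each model yields one image embedding and one text embedding, measures how far apart they point, and averages that distance across models. The function names, the toy vectors, and the choice to average are illustrative assumptions, not code taken from the notebook; real embeddings would come from the CLIP encoders.

```python
import math

def normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in v]

def spherical_distance(a, b):
    """Half the squared angle between two embeddings (0 means same direction)."""
    a, b = normalize(a), normalize(b)
    chord = math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    # Clamp to guard against floating-point overshoot before asin.
    half_chord = min(chord / 2.0, 1.0)
    return 2.0 * math.asin(half_chord) ** 2

def combined_guidance_loss(pairs):
    """Average the distance over (image_embedding, text_embedding) pairs,
    one pair per CLIP model (hypothetical helper)."""
    return sum(spherical_distance(img, txt) for img, txt in pairs) / len(pairs)

if __name__ == "__main__":
    # Toy 2-D "embeddings": one model agrees fully, the other is orthogonal.
    pairs = [([1.0, 0.0], [1.0, 0.0]), ([1.0, 0.0], [0.0, 1.0])]
    print(combined_guidance_loss(pairs))
```

A lower value means the image matches the prompt better according to the models involved, so a guided sampler would push the image in the direction that reduces it.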

Quick Start & Requirements

  • Installation typically involves cloning the repository and running Python scripts or notebooks.
  • Requires Python, PyTorch, and potentially CUDA for GPU acceleration. Specific model downloads may be necessary.
  • The project uses colab-convert to convert between its Python script and Jupyter notebook versions during development.
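
Before running the notebook locally, it can help to confirm that the listed requirements are present. The following sketch uses only the standard library to report the Python version and whether a package such as PyTorch can be imported. The helper name `check_environment` is hypothetical, and the package list is an assumption based on the requirements above rather than a complete dependency list.

```python
import importlib.util
import sys

def check_environment(packages=("torch",)):
    """Report the Python version and whether each top-level package is importable."""
    report = {"python": ".".join(str(n) for n in sys.version_info[:3])}
    for name in packages:
        # find_spec returns None when a top-level package is not installed.
        report[name] = importlib.util.find_spec(name) is not None
    return report

if __name__ == "__main__":
    for key, value in check_environment().items():
        print(f"{key}: {value}")
```

If PyTorch is reported as available, `torch.cuda.is_available()` indicates whether a CUDA GPU can be used.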

Highlighted Details

  • Supports 3D animation using depth estimation models (AdaBins, MiDaS) and pytorch3d.
  • Includes features like diffusion zooming, panning, rotation, and keyframing for animation.
  • Offers advanced techniques such as horizontal/vertical symmetry and warp mode leveraging optical flow.
  • Integrates various specialized models like OpenCLIP, Pixel Art Diffusion, and portrait generators.
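
Keyframing, mentioned above, amounts to giving a parameter such as zoom or rotation a value at a few frames and filling in the frames between them. The sketch below assumes a schedule string of the form `0: (1.0), 10: (1.05)` and linear interpolation; the parser and function names are hypothetical and written for illustration, not taken from the notebook.

```python
import re

def parse_keyframes(schedule):
    """Parse a string such as '0: (1.0), 10: (1.05)' into {frame: value}."""
    pattern = r"(\d+)\s*:\s*\(\s*(-?\d+(?:\.\d+)?)\s*\)"
    frames = {int(f): float(v) for f, v in re.findall(pattern, schedule)}
    if not frames:
        raise ValueError(f"no keyframes found in {schedule!r}")
    return frames

def interpolate_keyframes(keyframes, total_frames):
    """Linearly interpolate between keyframes; hold the nearest value outside them."""
    points = sorted(keyframes.items())
    values = []
    for frame in range(total_frames):
        if frame <= points[0][0]:
            values.append(points[0][1])
        elif frame >= points[-1][0]:
            values.append(points[-1][1])
        else:
            # Find the pair of keyframes that brackets this frame.
            for (f0, v0), (f1, v1) in zip(points, points[1:]):
                if f0 <= frame <= f1:
                    t = (frame - f0) / (f1 - f0)
                    values.append(v0 + t * (v1 - v0))
                    break
    return values

if __name__ == "__main__":
    zoom = interpolate_keyframes(parse_keyframes("0: (1.0), 4: (2.0)"), 6)
    print(zoom)
```

Each animation frame would then read its own value from the resulting list, so a single short string can describe a smooth zoom or rotation over the whole sequence.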

Maintenance & Community

The project has had many contributors and has evolved through multiple versions, but the last commit was 2 years ago and the repository is now listed as inactive (see Health Check). The README does not link to community channels or a roadmap.

Licensing & Compatibility

The README mentions an "added license that somehow went missing," but does not specify the license type or any restrictions. This lack of clarity may impact commercial use or integration into closed-source projects.

Limitations & Caveats

The project is described as a "frankensteinian amalgamation," which suggests a risk of instability or integration problems. Some features, such as SLIP models, have been removed because of import conflicts. The ViT-L/14@336px model requires a large amount of VRAM. The license status is unclear.

Health Check

  • Last commit: 2 years ago
  • Responsiveness: Inactive
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 12 stars in the last 90 days
