stable-diffusion-webui-depthmap-script by thygate

WebUI script for depth map creation and manipulation

created 2 years ago
1,815 stars

Top 24.3% on sourcepulse

Project Summary

This extension for AUTOMATIC1111's Stable Diffusion WebUI generates high-resolution depth maps, normal maps, and 3D stereo image pairs from images. It targets Stable Diffusion WebUI users and 3D content creators, enabling advanced image manipulation and 3D asset generation.

How It Works

The script leverages multiple state-of-the-art monocular depth estimation models, including Marigold, MiDaS, and ZoeDepth, to produce realistic depth maps. It employs multi-resolution merging (BoostingMonocularDepth) for enhanced detail and resolution. For 3D mesh generation, it utilizes Context-aware Layered Depth Inpainting, allowing for video rendering from the generated 3D scene.
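The multi-resolution merging idea can be illustrated with a toy sketch: take coarse structure from a low-resolution depth estimate and fine detail from a high-resolution one. This is not the BoostingMonocularDepth algorithm itself; `merge_depth_estimates` and its parameters are invented here for illustration only.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def box_blur(a: np.ndarray, k: int) -> np.ndarray:
    """Mean filter with edge padding; k must be odd."""
    padded = np.pad(a.astype(np.float64), k // 2, mode="edge")
    return sliding_window_view(padded, (k, k)).mean(axis=(2, 3))

def merge_depth_estimates(low_res: np.ndarray, high_res: np.ndarray, k: int = 3) -> np.ndarray:
    """Toy merge: coarse structure from the low-res estimate,
    high-frequency detail from the high-res estimate."""
    detail = high_res - box_blur(high_res, k)           # high-frequency residual
    scale = high_res.shape[0] // low_res.shape[0]
    coarse = np.kron(low_res, np.ones((scale, scale)))  # nearest-neighbor upsample
    return coarse + detail
```

The real pipeline merges many overlapping patch estimates with learned blending; this sketch only shows why combining two resolutions recovers both global layout and local detail.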

Quick Start & Requirements

  • Installation: Install as an extension from within Stable Diffusion WebUI (Extensions tab -> Available -> Load from -> install "Depth Maps"), or via Install from URL: https://github.com/thygate/stable-diffusion-webui-depthmap-script.
  • Dependencies: Requires Stable Diffusion WebUI. Models are downloaded automatically. GPU acceleration is supported and recommended for performance.
  • Standalone: Clone the repo, install the dependencies from requirements.txt, and run main.py.
  • Documentation: Wiki
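The standalone path above amounts to a short shell session (a sketch, assuming an existing Python environment with pip; the entry point main.py is taken from the Quick Start):

```shell
# Standalone usage sketch (no WebUI required)
git clone https://github.com/thygate/stable-diffusion-webui-depthmap-script
cd stable-diffusion-webui-depthmap-script
pip install -r requirements.txt
python main.py   # models are downloaded automatically on first run
```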

Highlighted Details

  • Supports depth map generation using Marigold, MiDaS (multiple variants), ZoeDepth, and LeReS.
  • Enables creation of 3D stereo image pairs (side-by-side, anaglyph) and normal maps.
  • Integrates with Rembg for background removal and supports video processing.
  • Generates 3D meshes using 3D-Photo-Inpainting for video rendering.
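Side-by-side stereo pairs are typically produced by shifting pixels horizontally in proportion to depth, and red-cyan anaglyphs by mixing the two views' channels. The numpy sketch below illustrates the principle only; the extension's actual generator handles occlusion inpainting and offers more options, and `max_shift` is an illustrative parameter, not one of the script's settings.

```python
import numpy as np

def stereo_pair(image: np.ndarray, depth: np.ndarray, max_shift: int = 8) -> np.ndarray:
    """Side-by-side stereo pair via depth-proportional horizontal pixel shifts.

    image: (H, W, 3) uint8; depth: (H, W), larger = closer. Occlusion holes
    simply keep the original pixel (real implementations inpaint them).
    """
    h, w = depth.shape
    d = depth.astype(np.float64)
    d = (d - d.min()) / (np.ptp(d) + 1e-8)       # normalize to [0, 1]
    shift = np.rint(d * max_shift).astype(int)   # per-pixel disparity
    left, right = image.copy(), image.copy()
    cols = np.arange(w)
    for y in range(h):
        left[y, np.clip(cols + shift[y], 0, w - 1)] = image[y, cols]
        right[y, np.clip(cols - shift[y], 0, w - 1)] = image[y, cols]
    return np.hstack((left, right))

def anaglyph(left: np.ndarray, right: np.ndarray) -> np.ndarray:
    """Red-cyan anaglyph: red channel from the left eye, green/blue from the right."""
    out = right.copy()
    out[..., 0] = left[..., 0]
    return out
```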

Maintenance & Community

  • Actively developed with contributions from multiple users.
  • Community interaction via GitHub Issues and Discussions.

Licensing & Compatibility

  • The repository does not explicitly state a license in the README. Users should verify licensing for underlying models and code.

Limitations & Caveats

  • Stereoscopic image generation is CPU-bound.
  • 3D mesh generation can be time-consuming, potentially taking up to an hour for large images.
  • The README notes potential VRAM issues with certain models and the "Boost" feature; lower-VRAM GPUs may need parameter adjustments.

Health Check

  • Last commit: 11 months ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 32 stars in the last 90 days
