DepthCrafter by Tencent

Depth estimation for open-world videos (CVPR 2025 Highlight)

Created 1 year ago

1,517 stars

Top 26.9% on SourcePulse

Project Summary

DepthCrafter generates temporally consistent, high-fidelity depth sequences for open-world videos without requiring camera poses or optical flow. It targets researchers and developers in computer vision and VFX, and reports better benchmark accuracy than Depth-Anything-V2 and Marigold.

How It Works

DepthCrafter produces long, consistent depth sequences with a diffusion model trained on large amounts of video data. The model learns temporal coherence and fine detail directly, so no explicit motion estimation or camera pose input is needed, which simplifies the pipeline and broadens applicability.

Quick Start & Requirements

Install: git clone https://github.com/Tencent/DepthCrafter.git, then cd DepthCrafter and pip install -r requirements.txt.
Prerequisites: GPU with ~26GB memory for high-resolution (1024x576) inference, or ~9GB for lower resolutions (512x256).
Demo: Gradio demo available online, or run locally with gradio app.py.
Resources: Project page for visualizations.
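
The memory figures above suggest a simple pre-flight check before choosing an inference resolution. The sketch below is a hypothetical helper, not part of DepthCrafter: the name pick_resolution and the choice to treat the approximate "~26GB" and "~9GB" figures as hard thresholds are assumptions made for illustration.

```python
from typing import Optional, Tuple

# Approximate requirements quoted above: (min GPU memory in GB, (width, height)).
# Treating the "~" figures as hard thresholds is an assumption of this sketch.
RESOLUTION_TIERS = [
    (26.0, (1024, 576)),
    (9.0, (512, 256)),
]


def pick_resolution(free_gb: float) -> Optional[Tuple[int, int]]:
    """Return the largest listed resolution that fits in free_gb, or None."""
    for required_gb, resolution in RESOLUTION_TIERS:
        if free_gb >= required_gb:
            return resolution
    return None


if __name__ == "__main__":
    for gb in (8, 12, 24, 40):
        print(gb, pick_resolution(gb))
```

If PyTorch is installed, the free memory could be read with torch.cuda.mem_get_info(), which returns free and total bytes; actual usage will also depend on video length and other settings, so the thresholds are only a starting point.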

Highlighted Details

Selected as a Highlight at CVPR 2025.
Achieves state-of-the-art performance on multiple benchmarks (Sintel, ScanNet, KITTI, Bonn), outperforming Depth-Anything-V2 and Marigold in AbsRel and δ₁ metrics.
Supports EXR output format.
Integrated into Nuke and ComfyUI for professional VFX and creative workflows.
Offers a Hugging Face online demo.
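
For readers unfamiliar with the two metrics named above, the following is a minimal sketch of their standard definitions: AbsRel is the mean of |pred - gt| / gt, and δ₁ is the fraction of pixels where max(pred/gt, gt/pred) is below 1.25. The function names are illustrative, and this is not the project's evaluation code. Benchmarks for relative-depth models typically align predictions to ground truth (scale and shift) before computing these values; that step is omitted here.

```python
from typing import Sequence


def abs_rel(pred: Sequence[float], gt: Sequence[float]) -> float:
    """Mean absolute relative error over pixels with valid (positive) ground truth."""
    pairs = [(p, g) for p, g in zip(pred, gt) if g > 0]
    if not pairs:
        raise ValueError("no valid ground-truth pixels")
    return sum(abs(p - g) / g for p, g in pairs) / len(pairs)


def delta1(pred: Sequence[float], gt: Sequence[float], threshold: float = 1.25) -> float:
    """Fraction of valid pixels whose ratio max(p/g, g/p) is below threshold."""
    pairs = [(p, g) for p, g in zip(pred, gt) if g > 0]
    if not pairs:
        raise ValueError("no valid ground-truth pixels")
    # Non-positive predictions cannot satisfy the ratio test, so they count as misses.
    hits = sum(1 for p, g in pairs if p > 0 and max(p / g, g / p) < threshold)
    return hits / len(pairs)


if __name__ == "__main__":
    pred = [1.0, 2.2, 4.0]
    gt = [1.0, 2.0, 2.0]
    print("AbsRel:", abs_rel(pred, gt))
    print("delta1:", delta1(pred, gt))
```

Lower AbsRel is better and higher δ₁ is better. The flat sequences stand in for flattened depth maps; a real evaluation would apply the same formulas per frame or per video with array operations.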

Maintenance & Community

Actively under development with v1.0.1 released for improved quality and speed.
Community support via GitHub issues. Related nodes for Nuke and ComfyUI are available.
Business inquiries can be directed to wbhu@tencent.com.

Licensing & Compatibility

The repository does not explicitly state a license. Users should verify licensing for commercial use or closed-source integration.

Limitations & Caveats

The project is still under active development, so breaking changes are possible.
Communication is recommended in English for broader community support.
No explicit license is provided, which may pose a barrier for commercial adoption.

Health Check

Last Commit: 2 months ago
Responsiveness: Inactive
Pull Requests (30d): 0
Issues (30d): 1
Star History: 17 stars in the last 30 days

Explore Similar Projects

DepthLM_Official by facebookresearch

Vision Language Models for metric depth estimation

Created 4 months ago

Updated 2 weeks ago

Starred by Shizhe Diao (Author of LMFlow; Research Scientist at NVIDIA).

VideoTuna by VideoVerses

Codebase for text-to-video applications

Created 1 year ago

Updated 5 months ago

FlashPortrait by Francis-Rings

Faster infinite portrait animation

Created 2 months ago

Updated 5 days ago

SteadyDancer by MCG-NJU

AI framework for harmonized human image animation

Created 3 months ago

Updated 2 months ago

ComfyUI-DyPE by wildminder

DyPE for FLUX: Artifact-free 4K+ image generation via ComfyUI

Created 4 months ago

Updated 2 months ago

LTX-Video-Trainer by Lightricks

Video model training and fine-tuning toolkit

Created 11 months ago

Updated 1 month ago

Starred by Jesse Clark (Cofounder of Marqo) and Shizhe Diao (Author of LMFlow; Research Scientist at NVIDIA).

RADIO by NVlabs

Vision foundation model for distilling large models

Created 2 years ago

Updated 2 weeks ago

Starred by Jiaming Song (Chief Scientist at Luma AI).

OpenLRM by 3DTopia

Open-source implementation of Large Reconstruction Models

Created 2 years ago

Updated 1 year ago

VBench by Vchitect

Benchmark suite for video generation models

Created 2 years ago

Updated 2 days ago

ComfyUI-WanVideoWrapper by kijai

ComfyUI nodes for advanced video generation

Created 1 year ago

Updated 3 days ago

Starred by Elvis Saravia (Founder of DAIR.AI), Ying Sheng (Coauthor of SGLang), and 5 more.

Open-Sora-Plan by PKU-YuanGroup

Open-source project aiming to reproduce Sora-like T2V model

Created 2 years ago

Updated 4 months ago

Starred by Andrej Karpathy (Founder of Eureka Labs; Formerly at Tesla, OpenAI; Author of CS 231n).

Wan2.2 by Wan-Video

Advanced video generation models with MoE architecture

Created 7 months ago

Updated 2 months ago
