Create emotional talking portraits with diffusion models
DICE-Talk is a diffusion-based method for generating emotional talking head videos from input images and audio. It addresses the challenge of creating vivid and diverse emotional expressions in speaking portraits by disentangling identity and emotion. The project targets researchers and developers working on AI-driven animation and content creation, offering a novel approach to controllable emotional synthesis in facial animation.
How It Works
The core of DICE-Talk is a diffusion-based generative model designed to disentangle identity and emotion. It employs a correlation-aware approach to ensure that generated emotions are consistent with the input audio and visual cues, leading to more realistic and diverse emotional expressions in talking portraits. Key components include pre-trained models for audio processing (Whisper), motion guidance (pose_guider), and video generation (stable-video-diffusion-img2vid-xt).
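To make the component list concrete, the sketch below pairs two of the named roles with a checkpoint and assembles the corresponding huggingface-cli download commands without running them. The Hub repository IDs (openai/whisper-tiny, stabilityai/stable-video-diffusion-img2vid-xt) are assumptions based on where those public models are usually hosted, and the checkpoints/ directory layout and the helper name download_command are hypothetical. The DICE-Talk checkpoint and pose_guider are left out because the text does not say where they are published.

```python
import shlex

# Role -> (Hub repo ID, local directory).
# Both the repo IDs and the checkpoints/ layout are assumptions,
# not values taken from the DICE-Talk README.
COMPONENTS = {
    "audio processing": ("openai/whisper-tiny", "checkpoints/whisper-tiny"),
    "video generation": (
        "stabilityai/stable-video-diffusion-img2vid-xt",
        "checkpoints/stable-video-diffusion-img2vid-xt",
    ),
}


def download_command(repo_id: str, local_dir: str) -> list[str]:
    """Build (but do not run) a huggingface-cli download invocation."""
    return ["huggingface-cli", "download", repo_id, "--local-dir", local_dir]


if __name__ == "__main__":
    for role, (repo_id, local_dir) in COMPONENTS.items():
        print(f"# {role}")
        print(shlex.join(download_command(repo_id, local_dir)))
```

Printing the commands rather than executing them keeps the sketch free of network access; the printed lines can be reviewed and adjusted before being pasted into a shell.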
Quick Start & Requirements
Install: [...] (cu118), ffmpeg, and then pip install -r requirements.txt.
Use huggingface-cli to download checkpoints for DICE-Talk, stable-video-diffusion-img2vid-xt, and whisper-tiny.
Run python3 demo.py with specified image, audio, and emotion paths.
Run python3 gradio_app.py for an interactive interface.
Highlighted Details
Maintenance & Community
The project released its initial version in April 2025 with ongoing updates planned. No specific community channels (like Discord or Slack) or notable contributors/sponsorships are mentioned in the provided text.
Licensing & Compatibility
The provided README does not specify a license. This is a critical omission for evaluating adoption and compatibility, especially for commercial use.
Limitations & Caveats
The project is described as an "initial version" with "continuous updates," suggesting it may still be under active development or in an alpha/beta state. It requires a high-end GPU (20GB+ VRAM) and is tested on Linux, potentially limiting its accessibility on other operating systems or lower-spec hardware. The absence of a specified license poses a significant adoption blocker.
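The hardware and platform caveats above can be checked before installing anything. The following is a minimal pre-flight sketch, not part of the project: the function names are hypothetical, the 20GB figure is read as 20 GiB, and GPU memory is read from nvidia-smi, which is assumed to be installed alongside the NVIDIA driver.

```python
import platform
import shutil
import subprocess

MIN_VRAM_MIB = 20 * 1024  # "20GB+ VRAM" from the text, interpreted as 20 GiB


def parse_vram_mib(nvidia_smi_output: str) -> list[int]:
    """Parse one total-memory value (MiB) per GPU, skipping non-numeric lines."""
    values = []
    for line in nvidia_smi_output.splitlines():
        token = line.strip()
        if token.isdigit():
            values.append(int(token))
    return values


def query_vram_mib() -> list[int]:
    """Return total memory per GPU in MiB, or [] if nvidia-smi is unavailable."""
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=memory.total", "--format=csv,noheader,nounits"],
        capture_output=True,
        text=True,
        check=False,
    )
    return parse_vram_mib(result.stdout) if result.returncode == 0 else []


def preflight(vram_mib: list[int], system: str, has_ffmpeg: bool) -> list[str]:
    """Return human-readable problems; an empty list means none were found."""
    problems = []
    if system != "Linux":
        problems.append(f"tested on Linux only; found {system}")
    if not has_ffmpeg:
        problems.append("ffmpeg not found on PATH")
    if not any(v >= MIN_VRAM_MIB for v in vram_mib):
        problems.append("no GPU with at least 20 GiB of memory detected")
    return problems


if __name__ == "__main__":
    issues = preflight(
        query_vram_mib(),
        platform.system(),
        shutil.which("ffmpeg") is not None,
    )
    print("\n".join(issues) if issues else "no obvious blockers found")
```

Keeping the parsing and the decision logic separate from the system queries means the checks can be exercised with sample values on a machine that has no GPU at all.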