Lip-syncing tool for generating videos from speech
Wav2Lip provides a robust solution for accurately lip-syncing videos to a target speech segment, even in challenging "in the wild" scenarios. It is aimed at researchers and developers who need to generate realistic talking-face videos from audio, and it works with any identity, voice, or language, including CGI faces and synthetic voices.
How It Works
Wav2Lip employs a novel "lip-sync expert": a discriminator trained to distinguish accurately lip-synced video from out-of-sync video. This pretrained expert is then used to supervise the training of the Wav2Lip generator, providing a strong synchronization signal. Decoupling lip-sync accuracy from visual quality in this way yields better performance across diverse inputs.
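The training signal described above can be sketched as follows. This is a minimal illustration, not the actual implementation: the feature extraction, the weighting scheme, and all function names here are assumptions, and the real expert operates on video frames and mel-spectrograms rather than toy vectors.

```python
# Illustrative sketch (NOT the real Wav2Lip code) of how a frozen lip-sync
# expert supervises the generator. All names and weights are assumptions.
import math

def expert_sync_prob(video_feat, audio_feat):
    """Stand-in for the frozen expert: cosine similarity of the two
    feature vectors, squashed into a (0, 1) 'in sync' probability."""
    dot = sum(v * a for v, a in zip(video_feat, audio_feat))
    nv = math.sqrt(sum(v * v for v in video_feat))
    na = math.sqrt(sum(a * a for a in audio_feat))
    cos = dot / (nv * na)
    return 1.0 / (1.0 + math.exp(-cos))

def generator_loss(recon_loss, video_feat, audio_feat, sync_weight=0.03):
    """Combine pixel reconstruction loss with the expert's sync penalty.
    The expert is frozen, so only the generator is penalized when the
    expert judges the lips to be out of sync with the audio."""
    p_sync = expert_sync_prob(video_feat, audio_feat)
    sync_loss = -math.log(p_sync)  # BCE against the 'in sync' label
    return (1 - sync_weight) * recon_loss + sync_weight * sync_loss
```

The key design point: because the expert is trained beforehand and then frozen, its judgment of synchronization cannot be weakened during generator training, so the generator must genuinely improve lip sync to reduce the penalty.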
Quick Start & Requirements
Install the Python dependencies:

pip install -r requirements.txt

Face detection requires the pretrained S3FD weights, which must be downloaded and placed at:

face_detection/detection/sfd/s3fd.pth
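Once the dependencies and weights are in place, inference is run from the command line. A small sketch of how the invocation is typically assembled; the `inference.py` flag names below are assumptions based on common usage of the repository, so check the project's own README for the authoritative options.

```python
# Hedged sketch: building the (assumed) Wav2Lip inference command line.
# Flag names (--checkpoint_path, --face, --audio) are assumptions.
import shlex

def build_inference_cmd(checkpoint, face, audio):
    """Return the argv list for invoking the inference script."""
    return [
        "python", "inference.py",
        "--checkpoint_path", checkpoint,  # pretrained Wav2Lip model
        "--face", face,                   # video or image of the speaker
        "--audio", audio,                 # speech to lip-sync to
    ]

cmd = build_inference_cmd("checkpoints/wav2lip_gan.pth", "input.mp4", "speech.wav")
print(shlex.join(cmd))
```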
Maintenance & Community
The project is associated with ACM Multimedia 2020. Contact information for authors and commercial inquiries is provided.
Licensing & Compatibility
This repository is strictly for personal/research/non-commercial use due to training on the LRS2 dataset. A commercial version is available via Sync Labs API (sync.so).
Limitations & Caveats
Training on datasets other than LRS2 may require significant code modifications and may not yield good results without careful dataset preparation and synchronization. The open-source code is not intended for commercial use.