AI presenter for generating synthetic speaker videos
Top 98.0% on sourcepulse
LIHQ is an application designed to generate high-quality AI presenter videos, targeting users who want to create synthetic speakers with custom faces and voices. It simplifies the complex process of AI video generation by integrating multiple open-source deep learning models, making advanced capabilities accessible with minimal setup, particularly within Google Colab environments.
How It Works
LIHQ orchestrates a pipeline of specialized models to achieve its results. It begins with a First Order Motion Model (FOMM) to transfer head and eye movements from a reference video to a user-provided face image. Subsequently, Wav2Lip synchronizes mouth movements with user-provided audio, overlaying this onto the FOMM output. The low-resolution output is then enhanced using GFPGAN for face restoration and upscaling, with an optional second pass for improved quality. Advanced options include frame interpolation for higher FPS and background matting.
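The stage ordering described above can be sketched as a simple orchestration function. This is a minimal illustration only: the function names and file paths below are placeholders, not LIHQ's actual API, and each model stage is stubbed to return the path of the artifact it would produce.

```python
# Sketch of the LIHQ pipeline order. All function names and paths are
# hypothetical stand-ins for the real FOMM / Wav2Lip / GFPGAN steps.

def run_fomm(face_image, reference_video):
    # Transfer head and eye motion from the reference video onto the face image.
    return "fomm_out.mp4"

def run_wav2lip(video, audio):
    # Synchronize mouth movements in the video with the provided audio track.
    return "wav2lip_out.mp4"

def run_gfpgan(video, second_pass=False):
    # Restore and upscale faces; an optional second pass improves quality.
    return "gfpgan_pass2.mp4" if second_pass else "gfpgan_pass1.mp4"

def generate_presenter(face_image, reference_video, audio, two_pass=True):
    animated = run_fomm(face_image, reference_video)      # motion transfer
    lip_synced = run_wav2lip(animated, audio)             # lip sync
    restored = run_gfpgan(lip_synced)                     # first restoration pass
    if two_pass:
        restored = run_gfpgan(restored, second_pass=True) # optional second pass
    return restored

print(generate_presenter("face.png", "ref.mp4", "speech.wav"))
```

The key point the sketch captures is that each stage consumes the previous stage's output, so the low-resolution FOMM and Wav2Lip results exist only as intermediates before GFPGAN restoration.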
Quick Start & Requirements
Highlighted Details
Maintenance & Community
Licensing & Compatibility
Limitations & Caveats
The project is designed primarily for Colab; local execution is noted as experimental. Achieving good results requires trial and error with the chosen face image and audio, with StyleGAN2-generated faces and simple narrator voices yielding the best output. Some features, such as frame interpolation, significantly increase inference time.
Last activity: 2 years ago. Status: Inactive.