PyTorch for real-time neural talking head synthesis
RAD-NeRF provides a PyTorch re-implementation of real-time neural radiance talking portrait synthesis, which decomposes the representation into separate audio and spatial components. It is designed for researchers and developers working on realistic avatar generation and animation from audio input. The project enables the creation of dynamic, talking portraits with impressive visual fidelity.
How It Works
RAD-NeRF leverages Neural Radiance Fields (NeRF) combined with audio-spatial decomposition. It processes input videos and audio to extract facial landmarks, semantic parsing, and head poses. The core innovation lies in its ability to synthesize novel views of a portrait that accurately lip-syncs to an input audio stream, achieving real-time performance through efficient rendering and optimized NeRF representations.
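The audio-spatial decomposition can be illustrated with a minimal NumPy sketch. This is not the project's actual PyTorch code: the layer sizes, the 29-dimensional audio feature (DeepSpeech-style, as used in AD-NeRF-derived pipelines), and the activation choices are all illustrative assumptions. The point is only the structure: a spatial branch encodes position, a separate small branch encodes the per-frame audio feature, and the two low-dimensional codes are fused before predicting density and color.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_weights(dims):
    # Random weights for a tiny MLP; stand-in for trained parameters.
    return [(rng.standard_normal((a, b)) * 0.1, np.zeros(b))
            for a, b in zip(dims[:-1], dims[1:])]

def mlp(x, weights):
    # Linear layers with ReLU between them, no activation on the last layer.
    for i, (W, b) in enumerate(weights):
        x = x @ W + b
        if i < len(weights) - 1:
            x = np.maximum(x, 0.0)
    return x

# Decomposed conditioning (hypothetical dimensions):
spatial_net = make_weights([3, 64, 32])      # xyz        -> 32-d spatial code
audio_net   = make_weights([29, 32, 8])      # audio feat -> 8-d audio code
head_net    = make_weights([32 + 8, 64, 4])  # fused code -> (sigma, rgb)

def field(xyz, audio_feat):
    s = mlp(xyz, spatial_net)                 # (N, 32) per-point spatial code
    a = mlp(audio_feat, audio_net)            # (1, 8) per-frame audio code
    fused = np.concatenate([s, np.repeat(a, len(s), axis=0)], axis=1)
    out = mlp(fused, head_net)                # (N, 4)
    sigma = np.maximum(out[:, :1], 0.0)       # non-negative density
    rgb = 1.0 / (1.0 + np.exp(-out[:, 1:]))   # colors squashed into [0, 1]
    return sigma, rgb

sigma, rgb = field(rng.standard_normal((1024, 3)),
                   rng.standard_normal((1, 29)))
```

Keeping the audio code low-dimensional and fusing it late is what makes per-frame updates cheap: only the small audio branch changes between frames, while the spatial encoding can be cached or stored in an efficient grid.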
Quick Start & Requirements
Install the Python dependencies with `pip install -r requirements.txt`. Ubuntu users also need `sudo apt install portaudio19-dev` for audio support. Data pre-processing additionally requires installing `pytorch3d` and downloading specific models (face-parsing, Basel Face Model).
Maintenance & Community
The project is based on AD-NeRF for data pre-processing and torch-ngp for the NeRF framework. The GUI is built with DearPyGui. No specific community channels (Discord/Slack) or active maintenance signals are mentioned in the README.
Licensing & Compatibility
The repository does not explicitly state a license. However, given its reliance on other projects, users should verify licensing compatibility for commercial or closed-source use.
Limitations & Caveats
The installation and data pre-processing steps are complex and require downloading multiple external files and models. The project is tested on Ubuntu 22.04, and compatibility with other operating systems may vary. Training can be memory-intensive, especially when preloading data to GPU.
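A rough back-of-envelope calculation shows why preloading is memory-intensive. The clip length and resolution below are hypothetical, but the arithmetic applies to any dataset: preloading stores every training frame as float32 RGB on the GPU.

```python
# Hypothetical training clip: 500 frames at 512x512, float32 RGB.
frames, height, width, channels = 500, 512, 512, 3
bytes_per_float32 = 4

preload_bytes = frames * height * width * channels * bytes_per_float32
print(f"{preload_bytes / 2**30:.1f} GiB")  # prints "1.5 GiB"
```

That 1.5 GiB covers the raw images alone; model parameters, optimizer state, and intermediate ray buffers come on top, so longer or higher-resolution clips can exhaust a consumer GPU quickly when preloading is enabled.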