Talking head synthesis via efficient region-aware neural radiance fields
This repository provides an implementation of ER-NeRF, an efficient region-aware neural radiance field method for synthesizing high-fidelity talking portraits. It targets researchers and developers in computer vision and graphics working on realistic human avatar generation and animation from speech. The key benefit is high-quality, dynamic portrait synthesis with controllable facial expressions.
How It Works
ER-NeRF uses Neural Radiance Fields (NeRFs) to represent the head and torso of a talking portrait. Its region-aware design combines a compact tri-plane hash representation of the spatial features with a region attention module that binds the driving audio features to the facial regions they actually affect, while the torso is modeled separately with its own pose conditioning. This enables disentangled control and high-fidelity rendering of talking portraits at low computational cost.
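To make the region-aware idea concrete, the sketch below shows one way per-point attention weights derived from a spatial encoding could gate an audio feature before it conditions the radiance field. The class name, layer sizes, and feature dimensions are hypothetical and do not mirror the repository's actual modules.

```python
import torch
import torch.nn as nn


class RegionAwareAudioConditioning(nn.Module):
    """Illustrative sketch: gate an audio feature with region-wise attention
    derived from a point's spatial encoding, then predict density and color."""

    def __init__(self, spatial_dim=32, audio_dim=64, hidden_dim=64):
        super().__init__()
        # Predict per-channel attention weights for the audio feature
        # from the spatial (e.g. hash-grid) encoding of each sample point.
        self.region_attention = nn.Sequential(
            nn.Linear(spatial_dim, hidden_dim),
            nn.ReLU(inplace=True),
            nn.Linear(hidden_dim, audio_dim),
            nn.Sigmoid(),
        )
        # Field head conditioned on spatial + attended audio features.
        self.field_mlp = nn.Sequential(
            nn.Linear(spatial_dim + audio_dim, hidden_dim),
            nn.ReLU(inplace=True),
            nn.Linear(hidden_dim, 4),  # (density, RGB)
        )

    def forward(self, spatial_feat, audio_feat):
        # spatial_feat: (N, spatial_dim) encodings of sampled 3D points
        # audio_feat:   (N, audio_dim) audio feature broadcast to each point
        weights = self.region_attention(spatial_feat)  # (N, audio_dim)
        attended_audio = weights * audio_feat          # region-aware gating
        return self.field_mlp(torch.cat([spatial_feat, attended_audio], dim=-1))


if __name__ == "__main__":
    model = RegionAwareAudioConditioning()
    points = torch.randn(1024, 32)     # spatial encodings of sampled points
    audio = torch.randn(1024, 64)      # one audio frame's feature per point
    print(model(points, audio).shape)  # torch.Size([1024, 4])
```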
Quick Start & Requirements
Setup requires specific older PyTorch and CUDA versions, and pytorch3d must be built from source.
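A minimal environment setup might look like the following; the exact Python, PyTorch, and CUDA versions shown are assumptions and should be checked against the repository's README and requirements file.

```bash
# Illustrative setup only; verify versions against the repository's instructions.
conda create -n ernerf python=3.10          # Python version is an assumption
conda activate ernerf

# Older PyTorch/CUDA combination (assumed: PyTorch 1.12.x with CUDA 11.3)
conda install pytorch==1.12.1 torchvision==0.13.1 cudatoolkit=11.3 -c pytorch

# pytorch3d is built from source
pip install "git+https://github.com/facebookresearch/pytorch3d.git"

# Remaining Python dependencies
pip install -r requirements.txt
```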
Highlighted Details
Maintenance & Community
The project is associated with ICCV 2023. Recent updates mention related work like InsTaG (CVPR 2025) and TalkingGaussian (ECCV 2024), indicating active development in the research group. No specific community links (Discord/Slack) are provided in the README.
Licensing & Compatibility
The repository does not explicitly state a license. Given its academic nature and reliance on other projects, users should verify licensing for commercial use.
Limitations & Caveats
The installation process requires specific older versions of PyTorch and CUDA, which may pose compatibility challenges on newer systems. Some datasets are not redistributed due to copyright and must be obtained independently. The combined "head+torso" model also shows noticeably worse image quality (lower PSNR, higher LPIPS) than the head-only model.
Last updated: 4 months ago (repository currently inactive).