DPE by OpenTalker

Video portrait editing research paper (CVPR 2023)

created 2 years ago
452 stars

Top 67.7% on sourcepulse

Project Summary

This repository provides the code for DPE (Disentanglement of Pose and Expression), a method for general video portrait editing. It transfers pose and expression from driving videos to a source video or image; audio-driven expression editing is on the roadmap but still marked as "TODO". The project targets researchers and practitioners in computer vision and graphics.

How It Works

DPE disentangles pose and expression so that each facial attribute can be controlled independently. A pre-trained model takes a source video or image together with a driving video and generates the edited output; the demo's --face option selects whether expression (exp), pose, or both are transferred. Because the two attributes are separated, pose transfer and expression transfer can be applied to portraits the model was not specifically trained on.

Quick Start & Requirements

  • Installation: Clone the repository, create a conda environment (conda create -n dpe python=3.8), activate it, install PyTorch 1.12.1 with CUDA 11.3, and then install requirements (pip install -r requirements.txt). GFPGAN is also required (pip install git+https://github.com/TencentARC/GFPGAN).
  • Pre-trained Model: Download from here and place in ./checkpoints.
  • Dependencies: Python 3.8, PyTorch 1.12.1, CUDA 11.3.
  • Demo: python run_demo.py --s_path <source_video> --d_path <driving_video> --model_path ./checkpoints/dpe.pt --face <exp|pose|both> --output_folder ./res
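
The demo invocation above can be assembled programmatically, which helps when scripting runs over several source/driving pairs. The following is a minimal sketch, not part of the repository: the helper name build_demo_command and the file names source.mp4 and driving.mp4 are hypothetical, and only the flags shown in the Demo line are used. It validates the --face value before building the argument list.

```python
import shlex

# Values accepted by --face, as listed in the Demo line above.
FACE_MODES = ("exp", "pose", "both")


def build_demo_command(source, driving, face="both",
                       model_path="./checkpoints/dpe.pt",
                       output_folder="./res"):
    """Return the run_demo.py invocation as an argv list.

    Hypothetical helper for illustration; it only mirrors the flags
    documented in the Demo line and does not run anything itself.
    """
    if face not in FACE_MODES:
        raise ValueError(f"face must be one of {FACE_MODES}, got {face!r}")
    return [
        "python", "run_demo.py",
        "--s_path", str(source),
        "--d_path", str(driving),
        "--model_path", str(model_path),
        "--face", face,
        "--output_folder", str(output_folder),
    ]


if __name__ == "__main__":
    # Placeholder file names; substitute real paths.
    cmd = build_demo_command("source.mp4", "driving.mp4", face="pose")
    print(shlex.join(cmd))
```

The returned list can be passed to subprocess.run from inside the cloned repository once the environment and checkpoint are in place.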

Highlighted Details

  • Supports video editing with pose transfer from a driving video and expression transfer from a second driving video; audio-driven expression transfer is listed but still marked as "TODO".
  • Offers one-shot editing capabilities using a single source image and driving videos for pose and expression.
  • Integrates GFPGAN for face enhancement.
  • Code is adapted from LIA, PIRenderer, and STIT.

Maintenance & Community

The paper was accepted to CVPR 2023. Code has been released for one-shot driving, training, enhancement, and video editing. A Colab demo is listed as a future task.

Licensing & Compatibility

The repository is licensed for personal/research/non-commercial use only. It is not an official Tencent product.

Limitations & Caveats

The current license restricts commercial use. Audio-driven video editing is marked as "TODO" in the development roadmap.

Health Check

  • Last commit: 1 year ago
  • Responsiveness: 1+ week
  • Pull Requests (30d): 0
  • Issues (30d): 0

Star History

4 stars in the last 90 days
