EasyVtuber by yuyuyzl

AI-driven VTuber app for real-time avatar animation

created 3 years ago
2,448 stars

Top 19.4% on sourcepulse

Project Summary

EasyVtuber is an open-source project designed to enable users to create virtual YouTubers (VTubers) with advanced facial tracking and animation capabilities. It targets aspiring VTubers, content creators, and hobbyists looking for a high-quality, customizable, and accessible solution that rivals commercial offerings like VTube Studio. The project aims to provide a seamless and high-performance experience, particularly by leveraging advanced AI models for facial tracking, frame interpolation, and upscaling.

How It Works

EasyVtuber integrates multiple AI models to achieve its functionality. It uses a Talking-Head-Anime (THA) model to generate animation from a static character image and incoming pose data. For smoother motion, it incorporates the RIFE model for frame interpolation, raising effective frame rates by roughly 50-100%. Upscaling is handled by waifu2x and Real-ESRGAN models to improve visual clarity. The project also features a UDP-based connection for high-refresh-rate facial tracking from iOS devices via iFacialMocap, and supports webcam input using OpenCV. TensorRT acceleration is available for NVIDIA GPUs, with DirectML support for AMD and Intel graphics cards.
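
For orientation, here is a minimal sketch of the two input paths described above. The UDP port (49983) and the pipe-delimited name-value packet layout are assumptions based on iFacialMocap's commonly documented protocol, and the function names are hypothetical; this is not the repository's actual code.

```python
# Minimal sketch of the two input paths, assuming iFacialMocap's commonly
# documented UDP protocol (port 49983, pipe-delimited "name-value" fields).
# Function names are hypothetical; this is not the repository's code.
import socket

import cv2  # the project uses OpenCV for its webcam input path

IFM_PORT = 49983  # assumed iFacialMocap default; confirm in the app


def read_ios_pose():
    """Receive one tracking packet from the phone and parse blendshape values."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind(("0.0.0.0", IFM_PORT))
    data, _addr = sock.recvfrom(8192)  # one tracking frame per datagram
    sock.close()
    pose = {}
    for field in data.decode("utf-8", "ignore").split("|"):
        # Simplified parsing: "mouthSmile_L-42" -> ("mouthSmile_L", 42.0).
        name, sep, value = field.rpartition("-")
        if sep and value.replace(".", "", 1).isdigit():
            pose[name] = float(value)
    return pose


def read_webcam_frame(index=0):
    """Grab a single frame via OpenCV, the non-iOS tracking path."""
    cap = cv2.VideoCapture(index)
    ok, frame = cap.read()
    cap.release()
    return frame if ok else None
```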

Quick Start & Requirements

  • Installation: Download and extract the provided integration package, or clone the repository and install dependencies using the provided batch scripts (01A.构建运行环境(默认源).bat, "build runtime environment, default source", or 01B.构建运行环境(国内源).bat, "build runtime environment, China mirror source") or manual Conda commands.
  • Prerequisites:
    • Windows 10.
    • iPhone with FaceID for iFacialMocap (paid app required) or a webcam.
    • Any gaming-grade GPU from the last 5 years (NVIDIA, AMD, Intel supported).
    • OBS Studio or Unity Capture for output.
    • Stable Wi-Fi connection for iOS tracking.
    • Optional: CUDA Toolkit 12.6.3, cuDNN, TensorRT for NVIDIA GPUs.
  • Setup: Building TensorRT models for NVIDIA GPUs can take over 20 minutes; a quick backend probe (sketched after this list) can confirm acceleration support first. Refer to the official documentation for detailed setup and configuration: https://github.com/zpeng11/EasyVtuber
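
Before committing to a 20-minute TensorRT build, it can help to confirm which acceleration backends your Python environment actually exposes. The sketch below assumes an onnxruntime-based stack (suggested by the TensorRT/DirectML options, though this is an assumption about the project's internals); the provider names are onnxruntime's own.

```python
# Generic onnxruntime probe to see which acceleration backends are visible
# before starting a long TensorRT model build. Assumes an onnxruntime-based
# stack; the provider names are onnxruntime's own, not EasyVtuber symbols.
import onnxruntime as ort

providers = ort.get_available_providers()
print("Available providers:", providers)

if "TensorrtExecutionProvider" in providers:
    print("TensorRT path available (expect a long first-time engine build).")
elif "CUDAExecutionProvider" in providers:
    print("CUDA available; TensorRT not found.")
elif "DmlExecutionProvider" in providers:
    print("DirectML available (AMD/Intel path).")
else:
    print("CPU only; real-time animation is unlikely.")
```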

Highlighted Details

  • Achieves up to 60fps facial tracking from iOS devices via UDP direct connection.
  • Supports NVIDIA TensorRT acceleration and DirectML for AMD/Intel GPUs.
  • RIFE frame interpolation offers a 50%-100% frame rate increase (for contrast, a naive blending sketch follows this list).
  • Includes waifu2x and Real-ESRGAN for upscaling and Spout2 support for native transparent channel output to OBS.
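
For intuition about what RIFE adds: the naive alternative to learned interpolation is a plain cross-fade, which produces ghosting on moving edges. The sketch below shows that naive baseline with OpenCV; it is illustrative only and is not RIFE or the repository's implementation.

```python
# Naive baseline for frame interpolation: a 50/50 cross-fade with OpenCV.
# This ghosting-prone blend is what motion-aware models like RIFE improve on;
# it is illustrative only and not the repository's implementation.
import cv2


def midpoint_frame(frame_a, frame_b):
    """Blend two frames equally -- a crude stand-in for a learned in-between."""
    return cv2.addWeighted(frame_a, 0.5, frame_b, 0.5, 0.0)

# Inserting one synthetic frame between each real pair doubles the frame count
# (a 100% rate increase); RIFE performs the same insertion with motion-aware
# frames instead of blends.
```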

Maintenance & Community

The project is developed in the open and has received contributions from multiple individuals, though recent activity is light (see Health Check below). Community and support links are provided in the repository's README.

Licensing & Compatibility

The project's licensing is not explicitly stated in the provided README excerpt. Compatibility for commercial use or closed-source linking would require clarification of the specific license terms.

Limitations & Caveats

  • Frame interpolation and upscaling cannot be used simultaneously due to current implementation limitations.
  • Some users may experience edge jitter with RIFE frame interpolation when using Spout2 output; the OBS Virtual Camera (without transparency) is an alternative.
  • DirectML performance on AMD/Intel GPUs may vary due to driver and implementation differences, potentially causing visual distortions or slower performance.

Health Check

  • Last commit: 2 months ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 123 stars in the last 90 days
