CelebV-Text by celebv-text

Dataset for facial text-to-video generation research

Created 3 years ago

406 stars

Top 71.2% on SourcePulse

Project Summary

CelebV-Text is a large-scale dataset designed to address the lack of high-quality, text-annotated video data for facial text-to-video generation tasks. It targets researchers and developers in AI-driven video editing and generation, providing a comprehensive resource to advance facial animation and manipulation based on textual descriptions.

How It Works

The dataset comprises 70,000 in-the-wild face video clips, totaling approximately 279 hours. Each video is paired with 20 semi-automatically generated text descriptions that capture both static and dynamic facial attributes. The annotations span six attribute categories: general appearances, detailed features, lighting conditions, actions, emotions, and light directions. This coverage is intended to support more accurate and controllable text-to-video synthesis.
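As a quick sanity check on the figures above, the arithmetic below derives two numbers the summary does not state directly: the total number of text descriptions and the mean clip length. It is only a back-of-the-envelope sketch; the 279-hour total is approximate, so the mean duration is approximate too, and the function name `dataset_stats` is made up for illustration.

```python
# Figures quoted in the summary above.
NUM_CLIPS = 70_000
TOTAL_HOURS = 279        # approximate total duration
TEXTS_PER_CLIP = 20


def dataset_stats(num_clips=NUM_CLIPS, total_hours=TOTAL_HOURS,
                  texts_per_clip=TEXTS_PER_CLIP):
    """Derive aggregate counts from the headline dataset figures."""
    total_seconds = total_hours * 3600
    return {
        "total_texts": num_clips * texts_per_clip,
        "mean_clip_seconds": total_seconds / num_clips,
    }


stats = dataset_stats()
print(stats["total_texts"])                   # 1400000
print(round(stats["mean_clip_seconds"], 1))   # 14.3
```

In other words, the dataset holds about 1.4 million text descriptions, and a typical clip runs roughly 14 seconds.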

Quick Start & Requirements

Dataset Download: Text descriptions and metadata are available via Google Drive links. Video download requires a Python script using youtube_dl and opencv-python.
Prerequisites: Python 3.x, youtube_dl, opencv-python.
Resources: Requires significant disk space for video storage and processing.
Links: Paper, Project Page, Demo Video
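The download step described above can be sketched roughly as follows. This is not the project's own script: the `ClipSpec` fields (`url`, `start_sec`, `end_sec`, `bbox`), the normalized bounding-box convention, and all function names are assumptions made for illustration, and the real metadata files may be laid out differently. Only the `youtube_dl` and `opencv-python` calls themselves are standard library API.

```python
from dataclasses import dataclass


@dataclass
class ClipSpec:
    """Hypothetical per-clip metadata; the real files may use other fields."""
    url: str            # source video URL
    start_sec: float    # clip start within the source video
    end_sec: float      # clip end within the source video
    bbox: tuple         # face box as frame fractions: (top, bottom, left, right)


def frame_range(spec, fps):
    """Convert the clip's time span into a half-open [first, last) frame range."""
    return round(spec.start_sec * fps), round(spec.end_sec * fps)


def pixel_box(spec, width, height):
    """Scale the normalized face box to pixel coordinates (y0, y1, x0, x1)."""
    top, bottom, left, right = spec.bbox
    return (round(top * height), round(bottom * height),
            round(left * width), round(right * width))


def download_source(spec, out_path):
    """Fetch the full source video with youtube_dl (needs network access)."""
    import youtube_dl  # imported lazily so the helpers above work without it
    with youtube_dl.YoutubeDL({"outtmpl": out_path, "format": "mp4"}) as ydl:
        ydl.download([spec.url])


def crop_clip(src_path, dst_path, spec):
    """Cut the time span and face region out of a downloaded video."""
    import cv2  # opencv-python, imported lazily for the same reason
    cap = cv2.VideoCapture(src_path)
    fps = cap.get(cv2.CAP_PROP_FPS)
    width = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
    height = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
    first, last = frame_range(spec, fps)
    y0, y1, x0, x1 = pixel_box(spec, width, height)
    writer = cv2.VideoWriter(dst_path, cv2.VideoWriter_fourcc(*"mp4v"),
                             fps, (x1 - x0, y1 - y0))
    cap.set(cv2.CAP_PROP_POS_FRAMES, first)
    for _ in range(last - first):
        ok, frame = cap.read()
        if not ok:
            break  # source ended before the requested span did
        writer.write(frame[y0:y1, x0:x1])
    cap.release()
    writer.release()


# The pure helpers can be checked without any download:
spec = ClipSpec(url="https://example.com/video", start_sec=1.0, end_sec=3.5,
                bbox=(0.125, 0.625, 0.25, 0.75))
print(frame_range(spec, 30.0))       # (30, 105)
print(pixel_box(spec, 1280, 720))    # (90, 450, 320, 960)
```

Keeping the time and box arithmetic in small pure functions makes it easy to verify before committing disk space to the full download, which matters given the storage requirement noted above.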

Highlighted Details

Contains 70,000 video clips with 20 text descriptions per clip, covering 6 attribute categories.
Includes a benchmark with representative methods (MMVID, TFGAN) for standardized evaluation.
Pretrained models for benchmark baselines are provided.
Demonstrates potential for enabling Visual GPT applications.

Maintenance & Community

The project is affiliated with OpenXDLab. Updates are posted via GitHub issues. The repository also lists related work and possible future directions.

Licensing & Compatibility

The CelebV-Text dataset is available for non-commercial research purposes only. Redistribution and commercial exploitation are strictly prohibited. Copies are allowed for internal use within a single organization.

Limitations & Caveats

Use is limited to non-commercial research. Users agree not to reproduce, duplicate, copy, sell, trade, resell, or exploit any portion of the dataset for commercial purposes, and further distribution is restricted.
