MingXiangL
Evaluation protocol for text-to-video generation models
This repository provides the official implementation for the DEVIL protocol, a novel evaluation framework for text-to-video (T2V) generation models. It addresses the limitations of existing methods by focusing on the "dynamics" dimension—visual vividness and adherence to prompt-specified motion—to assess T2V quality. The target audience includes researchers and developers working on T2V generation, offering a more comprehensive and human-aligned evaluation metric.
How It Works
DEVIL introduces a new benchmark of text prompts designed to capture various dynamics grades. It defines a set of dynamics scores across different temporal granularities and uses these to assess T2V models through three metrics: dynamics range, dynamics controllability, and dynamics-based quality. This dynamics-centric approach aims to provide a more nuanced understanding of video generation capabilities beyond simple temporal consistency.
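The summary above names the metrics but gives no formulas. As a rough illustration only, the sketch below shows one way two of them could be computed from per-video dynamics scores. Everything in it is an assumption and not DEVIL's actual definition: the function names, the tail-based spread used for "dynamics range", and the use of Spearman rank correlation between prompt dynamics grades and video dynamics scores for "dynamics controllability". Dynamics-based quality is omitted because the source does not describe it in enough detail.

```python
from statistics import mean


def dynamics_range(scores, tail=0.05):
    """Hypothetical range: mean of the top tail minus mean of the bottom tail."""
    ordered = sorted(scores)
    k = max(1, int(len(ordered) * tail))  # at least one video in each tail
    return mean(ordered[-k:]) - mean(ordered[:k])


def _average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of 1-based positions i+1 .. j+1
        for t in range(i, j + 1):
            ranks[order[t]] = avg
        i = j + 1
    return ranks


def dynamics_controllability(prompt_grades, video_scores):
    """Hypothetical controllability: Spearman correlation of grade vs. score."""
    rx = _average_ranks(prompt_grades)
    ry = _average_ranks(video_scores)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    if vx == 0 or vy == 0:
        return 0.0  # correlation is undefined when one side is constant
    return cov / (vx * vy) ** 0.5


# Made-up example data: prompt dynamics grades and measured video dynamics.
grades = [1, 1, 2, 2, 3, 3]
scores = [0.10, 0.15, 0.30, 0.45, 0.70, 0.90]
print(dynamics_range(scores))
print(dynamics_controllability(grades, scores))
```

A model whose videos become more dynamic as the prompts ask for more motion would score close to 1 on this stand-in controllability measure, while a model that ignores the requested dynamics would score near 0.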
Quick Start & Requirements
Install: run pip install -e . within the geminiplayground directory, followed by pip install -r requirements.txt
Run: bash eval_dynamics.dist.sh --video_dir dir_to_your_videos --gemini_api_key your_gemini_api_key --num_gpus 8
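The run command above can be wrapped in a small Python helper so that the API key is read from the environment instead of being typed into the command by hand. This is a minimal sketch. The script name and the three flags come from the command above, while the helper names, the GEMINI_API_KEY environment variable, and the repo_dir parameter are assumptions introduced for illustration.

```python
import os
import subprocess


def build_eval_command(video_dir, gemini_api_key, num_gpus=8):
    """Build the argument list for eval_dynamics.dist.sh (flags from the README)."""
    return [
        "bash", "eval_dynamics.dist.sh",
        "--video_dir", str(video_dir),
        "--gemini_api_key", gemini_api_key,
        "--num_gpus", str(num_gpus),
    ]


def run_eval(video_dir, num_gpus=8, repo_dir="."):
    """Run the evaluation script from repo_dir (hypothetical helper)."""
    # GEMINI_API_KEY is an assumed variable name, not one the project defines.
    key = os.environ.get("GEMINI_API_KEY")
    if not key:
        raise RuntimeError("Set GEMINI_API_KEY before running the evaluation.")
    cmd = build_eval_command(video_dir, key, num_gpus)
    # check=True raises CalledProcessError if the script exits non-zero.
    return subprocess.run(cmd, cwd=repo_dir, check=True)
```

Calling run_eval("dir_to_your_videos") from a checkout of the repository would then launch the same command as the one shown above, with the default of 8 GPUs.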
Maintenance & Community
The project acknowledges contributions from VBench, EvalCrafter, geminiplayground, and ViCLIP. A TODO list indicates plans for a new prompt version, a demo website, and a PyPI package.
Licensing & Compatibility
The repository's license is not explicitly stated in the README. Compatibility for commercial use or closed-source linking is not specified.
Limitations & Caveats
The naturalness metric's reliance on the Gemini API may incur costs and requires obtaining an API key. The project is still under active development with several items listed in the TODO section.