MiraData by mira-space

Video dataset for long video generation research

Created 1 year ago

509 stars

Top 61.5% on SourcePulse

1 Expert Loves This Project

Jiaming Song

Chief Scientist at Luma AI

Project Summary

MiraData is a large-scale video dataset for long video generation. It addresses two limitations of existing datasets: short clip durations and the lack of structured captions. It targets researchers and developers working on advanced video generation models, offering extended video clips and detailed, multi-faceted descriptions to improve temporal consistency and motion understanding.

How It Works

MiraData comprises video clips with an average duration of 72 seconds, significantly longer than the clips in typical video datasets. Each clip is accompanied by structured captions generated with GPT-4V, which describe the main objects, background, style, camera movement, and overall content. These captions are intended to provide richer semantic information for training and evaluating video generation models.
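To make the idea of a structured caption concrete, here is a minimal Python sketch of how one clip's captions might be held in memory and combined into a text prompt. The class name, the field names, and the sample strings are all hypothetical and chosen for illustration; they are not the dataset's actual schema or contents.

```python
from dataclasses import dataclass, fields


@dataclass
class StructuredCaption:
    """Captions for one clip. Field names are hypothetical, not the dataset schema."""
    short: str
    dense: str
    main_object: str
    background: str
    style: str
    camera: str

    def to_prompt(self, parts=("short", "camera")):
        """Join the selected caption types into a single text prompt."""
        valid = {f.name for f in fields(self)}
        unknown = [p for p in parts if p not in valid]
        if unknown:
            raise ValueError(f"unknown caption types: {unknown}")
        return " ".join(getattr(self, p).strip() for p in parts)


# Made-up example values, for illustration only.
caption = StructuredCaption(
    short="A sailboat crosses a bay at sunset.",
    dense="A small white sailboat moves slowly across a calm bay as the sun sets.",
    main_object="A small white sailboat with a single mast.",
    background="A calm bay with low hills on the horizon.",
    style="Warm, cinematic colors.",
    camera="Slow pan to the right.",
)
prompt = caption.to_prompt(("short", "style", "camera"))
print(prompt)
```

Keeping the caption types in separate fields lets a training pipeline choose which ones to condition on, for example only the short caption plus the camera description.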

Quick Start & Requirements

Install PyTorch built for CUDA 11.6: pip install torch==1.12.1+cu116 torchvision==0.13.1+cu116 torchaudio==0.12.1 --extra-index-url https://download.pytorch.org/whl/cu116
Install the remaining dependencies: pip install -r requirements.txt
Download the meta files from Google Drive or the HuggingFace dataset.
Run python download_data.py to download video samples.
For evaluation with MiraBench, run: python calculate_score.py --meta_file ...
The pinned PyTorch build requires CUDA 11.6.
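Before downloading videos, it can help to filter a meta file down to the clips of interest. The sketch below assumes the meta file is a CSV and uses hypothetical column names (clip_id, seconds, short_caption) with made-up rows in place of a real file; check the downloaded meta file for its actual format and columns.

```python
import csv
import io

# Made-up rows standing in for a downloaded meta file. The column names
# (clip_id, seconds, short_caption) are assumptions, not the real schema.
SAMPLE_META = """clip_id,seconds,short_caption
a001,95.0,A drone shot over a coastline
a002,12.5,A cat jumps onto a sofa
a003,72.0,A train passes through a valley
"""


def select_long_clips(meta_file, min_seconds):
    """Return the rows whose duration is at least min_seconds."""
    reader = csv.DictReader(meta_file)
    return [row for row in reader if float(row["seconds"]) >= min_seconds]


long_clips = select_long_clips(io.StringIO(SAMPLE_META), min_seconds=60.0)
long_ids = [row["clip_id"] for row in long_clips]
print(long_ids)
```

For a real file, the same function would take an open file handle, for example open(path, newline="", encoding="utf-8"), in place of the in-memory sample.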

Highlighted Details

The dataset is released in four versions: 330K, 93K, 42K, and 9K.
Each clip has six caption types: short, dense, background, main object, style, and camera.
MiraBench introduces 17 evaluation metrics across 6 perspectives for long video generation.
Captions were generated with GPT-4V; the prompting strategy is detailed in caption_gpt4v.py.

Maintenance & Community

Project lead: Zhaoyang Zhang.
Contact: mira-x@googlegroups.com.
Paper available on arXiv: https://arxiv.org/abs/2407.06358v1.
Project page: https://mira-space.github.io/.

Licensing & Compatibility

MiraData is released under the GPL-v3 license.
The project states that commercial usage is supported and asks users to contact the maintainers for a commercial license.
Copyright of videos remains with original owners; dataset is for informational purposes. Commercial use of the videos themselves is restricted.

Limitations & Caveats

The dataset is provided primarily for informational purposes, and the videos remain the property of their original owners. Users agree not to reproduce, duplicate, copy, sell, trade, resell, or exploit any portion of the videos or derived data for commercial purposes.

Health Check

Last commit: 1 year ago.
Responsiveness: Inactive.

Star History

6 stars in the last 30 days