TATS by songweige

PyTorch code for a research paper on long video generation

created 3 years ago
283 stars

Top 93.3% on sourcepulse

Project Summary

TATS is a PyTorch framework for generating long videos, addressing the challenge of creating thousands of frames from models trained on shorter sequences. It is designed for researchers and practitioners in generative AI and video synthesis.

How It Works

TATS employs a two-stage approach: a Time-Agnostic VQGAN for efficient video tokenization and a Time-Sensitive Transformer for autoregressive generation. This separation allows the VQGAN to learn robust visual representations independent of temporal dynamics, while the transformer focuses on capturing temporal coherence, enabling the generation of significantly longer videos than typically possible with direct autoregressive models.
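
The two stages described above can be illustrated with a toy sketch. This is not the TATS code: `quantize`, `sample_tokens`, `decode`, and the tiny codebook are hypothetical stand-ins, assuming only that stage one maps features to discrete codebook indices and stage two extends a token sequence one token at a time.

```python
import random


def quantize(vectors, codebook):
    """Stage 1 stand-in: map each vector to the index of its nearest codebook entry."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return [
        min(range(len(codebook)), key=lambda k: sq_dist(v, codebook[k]))
        for v in vectors
    ]


def sample_tokens(next_token_probs, prompt, num_new, rng):
    """Stage 2 stand-in: autoregressively append num_new sampled tokens to prompt."""
    tokens = list(prompt)
    for _ in range(num_new):
        probs = next_token_probs(tokens)  # distribution over codebook indices
        tokens.append(rng.choices(range(len(probs)), weights=probs, k=1)[0])
    return tokens


def decode(tokens, codebook):
    """Map token indices back to codebook vectors (a real decoder would render frames)."""
    return [codebook[t] for t in tokens]


# Hypothetical 3-entry codebook and a toy "model" that always predicts (last + 1) % 3.
codebook = [(0.0, 0.0), (1.0, 1.0), (0.0, 1.0)]


def cycle_model(tokens):
    probs = [0.0, 0.0, 0.0]
    probs[(tokens[-1] + 1) % 3] = 1.0
    return probs


indices = quantize([(0.1, -0.1), (0.9, 1.2), (0.2, 0.8)], codebook)
generated = sample_tokens(cycle_model, [0], 4, random.Random(0))
```

Because the tokenizer and the sequence model only communicate through token indices, either one can be swapped or retrained without changing the other's interface.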

Quick Start & Requirements

  • Install (run in order):
      conda create -n tats python=3.8
      conda activate tats
      conda install pytorch torchvision torchaudio cudatoolkit=10.2 -c pytorch
      pip install pytorch-lightning==1.5.4 einops ftfy h5py imageio imageio-ffmpeg regex scikit-video tqdm
  • Prerequisites: PyTorch with CUDA 10.2, Python 3.8.
  • Resources: The training examples use multi-GPU setups (e.g., 8 GPUs) and long training runs (up to 2M steps).
  • Docs: Project Website, Paper, Video.
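
Since the install pins an exact PyTorch Lightning version, a quick check of the active environment can catch mismatches before a long run. The sketch below is an assumption-laden helper, not part of the repository: `check_pins` and `PINNED` are hypothetical names, and only the standard-library `importlib.metadata` API is used.

```python
from importlib import metadata

# Pin taken from the install command above; extend as needed.
PINNED = {"pytorch-lightning": "1.5.4"}


def check_pins(pinned, get_version=metadata.version):
    """Return {package: found_version_or_None} for every package that misses its pin."""
    problems = {}
    for name, wanted in pinned.items():
        try:
            found = get_version(name)
        except metadata.PackageNotFoundError:
            found = None  # package is not installed at all
        if found != wanted:
            problems[name] = found
    return problems


if __name__ == "__main__":
    for name, found in check_pins(PINNED).items():
        print(f"{name}: expected {PINNED[name]}, found {found}")
```

The `get_version` parameter exists only so the check can be exercised without touching the real environment.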

Highlighted Details

  • Supports generation of long videos (thousands of frames) from short-trained models via sliding window.
  • Offers conditional generation for text and audio inputs.
  • Includes hierarchical sampling methods for enhanced control.
  • Provides scripts for both short and long video synthesis, FVD computation, and training VQGANs and Transformers.
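
The sliding-window idea in the first bullet can be sketched in a few lines. This is a simplified illustration over a flat token list, not the repository's sampler: `generate_long` and `toy_next_token` are hypothetical, and the real model operates on video tokens rather than small integers.

```python
def generate_long(next_token, prompt, total_len, window):
    """Extend prompt to total_len tokens while the model only ever sees the last `window` tokens."""
    tokens = list(prompt)
    while len(tokens) < total_len:
        context = tokens[-window:]  # bounded context, however long the output gets
        tokens.append(next_token(context))
    return tokens


def toy_next_token(context):
    """Hypothetical stand-in for a trained model: a deterministic function of the context."""
    return sum(context) % 5


long_sequence = generate_long(toy_next_token, [1, 2], total_len=8, window=3)
```

The output length is limited only by `total_len`, while the per-step cost and the context the model must handle stay fixed by `window`, which is why a model trained on short clips can be rolled out far beyond its training length.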

Maintenance & Community

The project accompanies an ECCV 2022 paper and credits VQGAN and VideoGPT. The README lists no community channels (Discord/Slack) and shows no signs of active maintenance.

Licensing & Compatibility

  • License: MIT License.
  • Compatibility: Permissive for commercial use and integration with closed-source projects.

Limitations & Caveats

The setup pins older dependencies (a PyTorch build for CUDA 10.2 and PyTorch Lightning 1.5.4), which may conflict with newer environments. The training examples also assume substantial hardware (multiple GPUs).

Health Check

  • Last commit: 1 year ago
  • Responsiveness: 1 day
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 3 stars in the last 90 days
