VideoSys by NUS-HPC-AI-Lab

Efficient infrastructure for advanced video generation

Created 1 year ago
2,003 stars

Top 22.1% on SourcePulse

Summary

VideoSys is open-source infrastructure for efficient, user-friendly video generation. It provides a toolkit that covers the full pipeline, from training through inference to serving, and integrates current open-source models and acceleration techniques. Its goal is to speed up AI video generation research and development by improving both throughput and memory efficiency.

How It Works

The system's core advantage lies in three acceleration techniques:

  • Data-Centric Parallel (DCP) dynamically adjusts the distributed training configuration based on incoming data with variable sequence lengths, achieving up to 2.1x speedup.
  • Pyramid Attention Broadcast (PAB) enables real-time DiT-based video generation with up to 10.6x acceleration and negligible quality loss, without requiring retraining.
  • Dynamic Sequence Parallelism (DSP) provides efficient sequence parallelism for multi-dimensional transformers, yielding 3x training and 2x inference speedups compared to state-of-the-art methods.
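The idea behind DCP, choosing the parallel configuration per batch rather than fixing it up front, can be illustrated with a toy sketch. This is not VideoSys code or its API: the function names, the per-GPU token budget, and the power-of-two policy are all assumptions made for illustration only.

```python
# Toy illustration only: NOT the VideoSys/DCP implementation.
# Assumption: each GPU can hold at most `max_tokens_per_gpu` tokens of one
# sequence, and the sequence-parallel degree must be a power of two.

def choose_sp_degree(num_tokens: int, max_tokens_per_gpu: int, world_size: int) -> int:
    """Return the smallest power-of-two sequence-parallel degree that fits
    the sequence, capped at world_size."""
    if world_size < 1 or world_size & (world_size - 1):
        raise ValueError("world_size must be a power of two")
    degree = 1
    while degree < world_size and num_tokens > degree * max_tokens_per_gpu:
        degree *= 2
    return degree


def plan_batch(num_tokens: int, max_tokens_per_gpu: int = 16_000, world_size: int = 8) -> dict:
    """Split the GPUs between sequence parallelism and data parallelism
    for a batch whose sequences have `num_tokens` tokens."""
    sp = choose_sp_degree(num_tokens, max_tokens_per_gpu, world_size)
    return {"sequence_parallel": sp, "data_parallel": world_size // sp}


if __name__ == "__main__":
    # Short clips leave all GPUs for data parallelism; long clips are
    # sharded along the sequence dimension instead.
    for tokens in (8_000, 40_000, 500_000):
        print(tokens, plan_batch(tokens))
```

Under these assumptions, short sequences keep all GPUs as data-parallel replicas, while long sequences trade replicas for sequence shards. A real system would also account for memory from activations and communication cost.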

Quick Start & Requirements

  • Installation: The recommended setup is to create a Python 3.10 Conda environment, clone the repository, and install it in editable mode by running pip install -e . from the repository root.
  • Prerequisites: Python >= 3.10, PyTorch >= 1.13 (>= 2.0 recommended), CUDA >= 11.6.
  • Demos: Easy demos are available via HuggingFace Space and Gradio.
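Before installing, it can help to confirm that the environment meets the minimum versions listed above. The following is a minimal sketch, not part of VideoSys: the helper names are hypothetical, and it relies only on `torch.__version__` and `torch.version.cuda` from PyTorch.

```python
import re
import sys

# Minimums taken from the prerequisites listed above.
MIN_PYTHON = (3, 10)
MIN_TORCH = (1, 13)
MIN_CUDA = (11, 6)


def parse_version(text: str) -> tuple:
    """Reduce a version string such as '2.1.0+cu118' to (major, minor)."""
    numbers = []
    for piece in text.split("+")[0].split(".")[:2]:
        match = re.match(r"\d+", piece)
        numbers.append(int(match.group()) if match else 0)
    while len(numbers) < 2:
        numbers.append(0)
    return tuple(numbers)


def meets_minimum(found: str, minimum: tuple) -> bool:
    """True if the (major, minor) of `found` is at least `minimum`."""
    return parse_version(found) >= minimum


def check_environment() -> dict:
    """Report which prerequisites the current interpreter satisfies."""
    report = {"python": sys.version_info[:2] >= MIN_PYTHON}
    try:
        import torch
    except ImportError:
        # PyTorch is not installed, so neither check can pass.
        report["torch"] = False
        report["cuda"] = False
        return report
    report["torch"] = meets_minimum(torch.__version__, MIN_TORCH)
    cuda = torch.version.cuda  # None on CPU-only builds
    report["cuda"] = cuda is not None and meets_minimum(cuda, MIN_CUDA)
    return report


if __name__ == "__main__":
    print(check_environment())
```

The check compares only major and minor versions, which matches the granularity of the stated prerequisites.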

Highlighted Details

  • Supports integration with models like CogVideoX, Vchitect-2.0, Open-Sora-Plan, Latte, and Open-Sora.
  • PAB achieves real-time DiT-based video generation at up to 21.6 FPS.
  • DSP provides significant speedups for Open-Sora inference (e.g., 22s vs 45s on 8xH800 for a 10s video).
  • DCP offers a simple, effective method to empower any video model with minimal code changes.
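As a quick sanity check on the figures above, the quoted timings can be converted into a speedup factor and a per-frame latency. This is plain arithmetic on the numbers stated in this summary; the helper names are illustrative.

```python
def speedup(baseline_seconds: float, optimized_seconds: float) -> float:
    """Ratio of baseline time to optimized time."""
    return baseline_seconds / optimized_seconds


def frame_latency_ms(frames_per_second: float) -> float:
    """Average time spent per frame, in milliseconds."""
    return 1000.0 / frames_per_second


if __name__ == "__main__":
    # Open-Sora inference with DSP: 45s baseline vs 22s on 8xH800.
    print(f"DSP inference speedup: {speedup(45, 22):.2f}x")
    # PAB at up to 21.6 FPS.
    print(f"PAB per-frame latency: {frame_latency_ms(21.6):.1f} ms")
```

The 45s to 22s result works out to roughly 2x, in line with the inference speedup quoted for DSP, and 21.6 FPS corresponds to about 46 ms per frame.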

Maintenance & Community

The project shows active development with recent updates in late 2024. Community interaction is facilitated through a Discord server. Links to detailed papers, blogs, and documentation for its acceleration techniques are provided.

Licensing & Compatibility

The specific open-source license for VideoSys is not explicitly stated in the provided README. Further investigation of the repository is recommended for licensing details and compatibility considerations.

Limitations & Caveats

Some features are marked as "work in progress" (🟡). The README does not specify hardware requirements beyond CUDA, nor does it provide estimated setup times or resource footprints.

Health Check

  • Last commit: 1 month ago
  • Responsiveness: Inactive
  • Pull requests (30d): 0
  • Issues (30d): 1
  • Star history: 8 stars in the last 30 days
