Open-Sora by hpcaitech

Video generation initiative for efficient, high-quality video production

created 1 year ago
26,960 stars

Top 1.5% on sourcepulse

Project Summary

Open-Sora is an open-source initiative focused on democratizing efficient, high-quality video production. It provides accessible models, tools, and training code for researchers and content creators looking to simplify complex video generation tasks. The project aims to foster innovation and inclusivity in AI-driven video creation.

How It Works

Open-Sora leverages a diffusion model architecture, incorporating advancements like 3D-VAE, rectified flow, and score conditioning for improved video quality. It supports a full pipeline from data preprocessing to accelerated training and inference, enabling efficient generation across various resolutions, lengths, and aspect ratios.
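To make the rectified flow idea concrete, here is a minimal sketch in plain Python. It is a generic textbook formulation, not Open-Sora's code: the function names, the convention that t = 0 is noise and t = 1 is data, and the "oracle" velocity standing in for a trained network are all assumptions made for illustration.

```python
import random


def interpolate(x0, x1, t):
    """Point on the straight line from noise x0 (t=0) to data x1 (t=1)."""
    return [(1.0 - t) * a + t * b for a, b in zip(x0, x1)]


def target_velocity(x0, x1):
    """Velocity of the straight path; this is the regression target."""
    return [b - a for a, b in zip(x0, x1)]


def flow_matching_loss(predicted_v, x0, x1):
    """Mean squared error between a predicted velocity and the target."""
    target = target_velocity(x0, x1)
    return sum((p - q) ** 2 for p, q in zip(predicted_v, target)) / len(target)


def euler_sample(velocity_fn, x0, num_steps):
    """Integrate dx/dt = velocity_fn(x, t) from t=0 to t=1 with Euler steps."""
    x = list(x0)
    dt = 1.0 / num_steps
    for i in range(num_steps):
        v = velocity_fn(x, i * dt)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x


rng = random.Random(0)
noise = [rng.gauss(0.0, 1.0) for _ in range(4)]
data = [0.5, -1.0, 2.0, 0.0]


def oracle(x, t):
    # Hypothetical stand-in for a trained model: returns the exact velocity.
    return target_velocity(noise, data)


sample = euler_sample(oracle, noise, num_steps=8)
```

With a perfect velocity field the path is a straight line, so even a few Euler steps carry the noise vector onto the data vector. A learned model only approximates that field, which is why the step count matters in practice.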

Quick Start & Requirements

  • Installation: From the repository root, run pip install -v . (or pip install -v -e . for a development install). Requires PyTorch >= 2.4.0. Install xformers and flash-attn for optimized performance.
  • Prerequisites: Python 3.10+, CUDA 12.1+ for xformers.
  • Model Download: Available via Hugging Face or ModelScope.
  • Resources: The project cites a training cost of $200K for the 11B model. Inference figures are reported on H100/H800 GPUs (e.g., 60 s at 256x256 on a single GPU).
  • Demos & Docs: Gradio Demo, Gallery, Tech Reports.
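The requirements above can be checked before installing. The following is a small sketch, not part of Open-Sora: check_environment and parse_version are hypothetical helpers, and the check assumes the packages are importable under the module names xformers and flash_attn.

```python
import importlib.metadata
import importlib.util
import re
import sys


def parse_version(text):
    """Turn '2.4.0+cu121' into (2, 4, 0); any suffix after the numbers is ignored."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", text)
    if match is None:
        raise ValueError(f"unrecognized version string: {text!r}")
    return tuple(int(group) if group else 0 for group in match.groups())


def check_environment():
    """Report which of the listed requirements the current interpreter meets."""
    report = {"python>=3.10": sys.version_info >= (3, 10)}
    try:
        torch_version = importlib.metadata.version("torch")
        report["torch>=2.4.0"] = parse_version(torch_version) >= (2, 4, 0)
    except importlib.metadata.PackageNotFoundError:
        report["torch>=2.4.0"] = False
    # Optional accelerators: only checks that the modules can be found.
    for module in ("xformers", "flash_attn"):
        report[module] = importlib.util.find_spec(module) is not None
    return report


if __name__ == "__main__":
    for requirement, ok in check_environment().items():
        print(f"{requirement}: {'ok' if ok else 'missing'}")
```

The script only inspects installed package metadata. It does not verify the CUDA toolkit version, which still needs to be checked separately.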

Highlighted Details

  • Open-Sora 2.0 (11B) achieves performance on par with HunyuanVideo 11B and Step-Video 30B on VBench and human preference tests.
  • Supports text-to-video, image-to-video, and video-to-video generation, with flexible aspect ratios and lengths.
  • Offers prompt refinement using ChatGPT and dynamic motion score evaluation.
  • Reports reduced training costs: a claimed 50% saving, with the 11B model trained for $200K.
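One way to picture "flexible aspect ratios" is to hold the pixel budget roughly constant while changing the shape of the frame. The sketch below is illustrative only: size_for_aspect_ratio is a hypothetical helper, and snapping each side to a multiple of 16 is an assumption, not a documented Open-Sora rule.

```python
import math


def size_for_aspect_ratio(base_side, aspect_ratio, multiple=16):
    """Return (height, width) with about base_side**2 pixels at the given ratio.

    aspect_ratio is a 'width:height' string such as '16:9'. Each side is
    rounded to the nearest multiple of `multiple` (an assumed constraint).
    """
    width_part, height_part = (int(part) for part in aspect_ratio.split(":"))
    ratio = width_part / height_part
    area = base_side * base_side
    raw_height = math.sqrt(area / ratio)
    raw_width = raw_height * ratio

    def snap(value):
        return max(multiple, round(value / multiple) * multiple)

    return snap(raw_height), snap(raw_width)


square = size_for_aspect_ratio(256, "1:1")
landscape = size_for_aspect_ratio(256, "16:9")
portrait = size_for_aspect_ratio(256, "9:16")
```

Because of the rounding, the non-square sizes use slightly fewer or more pixels than the square one, so compute cost stays comparable across shapes but is not identical.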

Maintenance & Community

The project is actively developed with multiple version branches (v1.0, v1.1, v1.2, v1.3, main). Key contributors are listed, and acknowledgements include significant contributions from ColossalAI, DiT, OpenDiT, PixArt, Flux, and StabilityAI VAE.

Licensing & Compatibility

The repository does not explicitly state a license in the README. Compatibility for commercial use or closed-source linking is not specified.

Limitations & Caveats

Because the README states no license, the terms for commercial adoption are unclear. Performance claims are strong, but direct comparisons with proprietary state-of-the-art models such as Sora are still ongoing. The project is under active development, so breaking changes between versions are possible.

Health Check

  • Last commit: 3 months ago
  • Responsiveness: 1 week
  • Pull Requests (30d): 1
  • Issues (30d): 6

Star History

772 stars in the last 90 days
