Open-Sora by hpcaitech

Video generation initiative for efficient, high-quality video production

Created 1 year ago
27,206 stars

Top 1.4% on SourcePulse

Project Summary

Open-Sora is an open-source initiative focused on democratizing efficient, high-quality video production. It provides accessible models, tools, and training code for researchers and content creators looking to simplify complex video generation tasks. The project aims to foster innovation and inclusivity in AI-driven video creation.

How It Works

Open-Sora leverages a diffusion model architecture, incorporating advancements like 3D-VAE, rectified flow, and score conditioning for improved video quality. It supports a full pipeline from data preprocessing to accelerated training and inference, enabling efficient generation across various resolutions, lengths, and aspect ratios.
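To make the rectified flow idea concrete, here is a minimal toy sketch in plain Python. It is not Open-Sora's implementation: the function names, the time convention (noise at t=0, data at t=1), and the "oracle" velocity that stands in for a trained network are all assumptions chosen for illustration. It shows the straight-line path between noise and data, the constant velocity a model would be trained to predict along that path, and Euler integration of the velocity field at sampling time.

```python
def interpolate(x0, x1, t):
    """Point on the straight line from noise x0 (t=0) to data x1 (t=1)."""
    return [(1.0 - t) * a + t * b for a, b in zip(x0, x1)]


def target_velocity(x0, x1):
    """Training target for the velocity model: constant along the straight path."""
    return [b - a for a, b in zip(x0, x1)]


def make_oracle_velocity(x1):
    """Hypothetical stand-in for a trained network: the exact velocity
    field when the data distribution is the single point x1."""
    def velocity(x, t):
        # On the straight path, x1 - x_t = (1 - t) * (x1 - x0).
        return [(b - a) / (1.0 - t) for a, b in zip(x, x1)]
    return velocity


def euler_sample(velocity_fn, x0, num_steps=10):
    """Integrate dx/dt = v(x, t) from t=0 to t=1 with fixed-step Euler."""
    x = list(x0)
    dt = 1.0 / num_steps
    for i in range(num_steps):
        t = i * dt  # always < 1, so the oracle never divides by zero
        v = velocity_fn(x, t)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x


if __name__ == "__main__":
    noise = [0.0, 2.0]
    data = [1.0, -1.0]
    print(euler_sample(make_oracle_velocity(data), noise, num_steps=8))
```

Because the path is a straight line, the velocity is constant along it, which is why a small number of Euler steps can be enough in this toy setting. A real model would replace the oracle with a learned network conditioned on the prompt.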

Quick Start & Requirements

  • Installation: run pip install -v . from the repository root (use pip install -v -e . for an editable development install). Requires PyTorch >= 2.4.0. Installing xformers and flash-attn is recommended for optimized performance.
  • Prerequisites: Python 3.10+, CUDA 12.1+ for xformers.
  • Model Download: Available via Hugging Face or ModelScope.
  • Resources: The reported training cost is about $200K for the 11B model. Inference is benchmarked on H100/H800 GPUs (e.g., about 60s for a 256x256 video on a single GPU).
  • Demos & Docs: Gradio Demo, Gallery, Tech Reports.
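The version requirements above can be checked before installing. The following is a minimal sketch, not part of the Open-Sora repository: the helper names (`parse_version`, `meets_minimum`, `check_environment`) are hypothetical, and only the minimums stated above (Python 3.10, PyTorch 2.4.0) are assumed. Versions are compared as integer tuples, since a plain string comparison would wrongly rank "2.10.0" below "2.4.0".

```python
import sys

MIN_PYTHON = (3, 10)    # from the prerequisites listed above
MIN_TORCH = (2, 4, 0)   # from the installation note listed above


def parse_version(text):
    """Turn a string such as '2.4.0+cu121' into (2, 4, 0).

    Local build tags after '+' and non-digit suffixes are ignored.
    """
    core = str(text).split("+", 1)[0]
    parts = []
    for piece in core.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        parts.append(int(digits) if digits else 0)
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def meets_minimum(found, minimum):
    """Compare versions numerically, component by component."""
    return tuple(found) >= tuple(minimum)


def check_environment(python_version=None, torch_version=None):
    """Return a list of problems; an empty list means the minimums are met."""
    problems = []
    py = tuple(python_version or sys.version_info[:2])
    if not meets_minimum(py, MIN_PYTHON):
        problems.append(f"Python {py[0]}.{py[1]} is older than 3.10")
    if torch_version is None:
        try:
            import torch  # only imported when no version is passed in
            torch_version = torch.__version__
        except ImportError:
            problems.append("PyTorch is not installed")
            return problems
    if not meets_minimum(parse_version(torch_version), MIN_TORCH):
        problems.append(f"PyTorch {torch_version} is older than 2.4.0")
    return problems


if __name__ == "__main__":
    for problem in check_environment():
        print("WARNING:", problem)
```

Passing explicit versions, as in `check_environment((3, 9), "2.3.1")`, makes the check usable in CI without PyTorch installed. This sketch does not verify the CUDA version, xformers, or flash-attn.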

Highlighted Details

  • Open-Sora 2.0 (11B) achieves performance on par with HunyuanVideo 11B and Step-Video 30B on VBench and human preference tests.
  • Supports text-to-video, image-to-video, and video-to-video generation, with flexible aspect ratios and lengths.
  • Offers prompt refinement using ChatGPT and dynamic motion score evaluation.
  • Reports reduced training costs: a claimed 50% saving, with the 11B model trained for about $200K.

Maintenance & Community

The project is actively developed with multiple version branches (v1.0, v1.1, v1.2, v1.3, main). Key contributors are listed, and acknowledgements include significant contributions from ColossalAI, DiT, OpenDiT, PixArt, Flux, and StabilityAI VAE.

Licensing & Compatibility

The repository does not explicitly state a license in the README. Compatibility for commercial use or closed-source linking is not specified.

Limitations & Caveats

The README does not specify licensing details, which may impact commercial adoption. While performance claims are strong, direct comparisons to state-of-the-art proprietary models like Sora are ongoing. The project is under active development, suggesting potential for breaking changes across versions.

Health Check

  • Last Commit: 4 months ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 1
  • Issues (30d): 2
  • Star History: 200 stars in the last 30 days
