minisora by mini-sora

Community initiative exploring Sora implementation and development

Created 2 years ago

1,286 stars

Top 30.7% on SourcePulse

View on GitHub

1 Expert Loves This Project

Vincent Weisser

Cofounder of Prime Intellect

Project Summary

This repository is a community-driven initiative focused on exploring and replicating the technology behind OpenAI's Sora, a text-to-video generation model. It aims to provide accessible implementations and foster research into diffusion models for video generation, targeting researchers and developers interested in state-of-the-art video synthesis.

How It Works

The project centers on reproducing key research papers and technologies related to Sora, such as DiT (Diffusion Transformer). It leverages existing frameworks like XTuner for efficient sequence training and aims to develop GPU-friendly and training-efficient models. The approach involves a comprehensive review of diffusion models for video generation, from DDPM to advanced transformer-based architectures.

Quick Start & Requirements

Installation: Not explicitly detailed, but likely involves Python and PyTorch.
Requirements: The project aims for GPU-friendly operation, targeting configurations like 8x A100 80GB, 8x A6000 48GB, or RTX4090 24GB for training and inference. Specific requirements for reproducing DiT mention 2x A100.
Resources: The project is actively recruiting contributors familiar with OpenMMLab's MMEngine and DiT.
Links: MiniSora-DiT

Highlighted Details

Focus on reproducing DiT (Scalable Diffusion Models with Transformers).
Aims for GPU-friendly training and inference with moderate hardware.
Comprehensive survey of video generation models and related technologies.
Community-driven exploration of Sora's implementation and future directions.

Maintenance & Community

The project is driven by the MiniSora Community, with regular round-table discussions involving the Sora team and community members. It actively recruits contributors and provides links to WeChat groups for community engagement.

Licensing & Compatibility

The repository's license is not explicitly stated in the README.

Limitations & Caveats

The project is a community effort to replicate complex research; therefore, the fidelity and performance of reproduced models may vary. Specific implementation details and stability are subject to ongoing community development.

minisora by mini-sora

Explore Similar Projects

BakLLaVA by SkunkworksAI

Chatglm_lora_multi-gpu by liangwq

ddpo-pytorch by kvablack

fairseq2 by facebookresearch

HunyuanVideo-I2V by Tencent-Hunyuan

musubi-tuner by kohya-ss

Pyramid-Flow by jy0205

VILA by NVlabs

DialoGPT by microsoft

NExT-GPT by NExT-GPT

LLaVA by haotian-liu

ColossalAI by hpcaitech