Text-To-Video-Finetuning by ExponentialML

Finetuning script for text-to-video models

created 2 years ago
687 stars

Project Summary

This repository provides tools for finetuning ModelScope's Text-to-Video model using the Diffusers library. It targets researchers and developers interested in customizing video generation models, offering capabilities for LoRA training and model conversion for web UIs.

How It Works

The project uses the Diffusers library to finetune video diffusion models. It supports training from scratch as well as finetuning existing models, such as ModelScope's Text-to-Video and community checkpoints like ZeroScope. LoRA (Low-Rank Adaptation) training is available for efficient finetuning with reduced compute, and VRAM usage can be managed with gradient checkpointing and memory-efficient attention (xFormers or PyTorch 2.0 scaled dot-product attention, SDP).
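To make the "reduced computational resources" point concrete, the sketch below compares trainable weight counts for one linear layer under full finetuning and under a LoRA update. The layer size and rank are illustrative assumptions, not values from the repository, and the function names are hypothetical.

```python
def full_finetune_params(d_in: int, d_out: int) -> int:
    """Trainable weights when the whole d_out x d_in matrix is updated."""
    return d_in * d_out


def lora_params(d_in: int, d_out: int, rank: int) -> int:
    """Trainable weights for a low-rank update B @ A.

    A has shape (rank, d_in) and B has shape (d_out, rank);
    the original matrix stays frozen.
    """
    return rank * (d_in + d_out)


# Illustrative layer size and rank (assumptions, not taken from the repository).
d_in, d_out, rank = 1024, 1024, 16
full = full_finetune_params(d_in, d_out)
lora = lora_params(d_in, d_out, rank)
print(f"full: {full:,}  lora: {lora:,}  ratio: {lora / full:.4f}")
```

Because the LoRA count grows with `rank * (d_in + d_out)` rather than `d_in * d_out`, the savings increase as layers get wider.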

Quick Start & Requirements

  • Install: git clone https://github.com/ExponentialML/Text-To-Video-Finetuning.git && cd Text-To-Video-Finetuning && git lfs install && git clone https://huggingface.co/damo-vilab/text-to-video-ms-1.7b ./models/model_scope_diffusers/
  • Environment: Python 3.10, PyTorch >= 2.0 recommended.
  • Hardware: RTX 3090 recommended; 16GB VRAM GPUs can train with optimizations (validation off, Xformers/SDP, gradient checkpointing, 256 resolution, LoRA).
  • Docs: https://github.com/ExponentialML/Text-To-Video-Finetuning
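The hardware note above can be read as a simple rule: below RTX 3090-class memory, switch on every listed optimization. The following is a minimal sketch of that rule. The function and all dictionary keys are hypothetical names chosen for illustration; they are not the repository's config schema, so map them onto the actual training config by hand.

```python
def training_preset(vram_gb: float) -> dict:
    """Return a hypothetical settings dict for a GPU with `vram_gb` of memory.

    Key names are invented for this sketch and are not the
    repository's real config keys.
    """
    # An RTX 3090 has 24 GB; anything smaller gets the low-memory options.
    low_memory = vram_gb < 24
    return {
        "use_lora": low_memory,
        "gradient_checkpointing": low_memory,
        "memory_efficient_attention": low_memory,  # xFormers or Torch 2.0 SDP
        "run_validation": not low_memory,
        # None means "keep whatever resolution is already configured".
        "resolution": 256 if low_memory else None,
    }


preset_16gb = training_preset(16)
print(preset_16gb)
```

The summary only states that 16 GB cards can train with these optimizations; the sketch makes no claim about smaller cards.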

Highlighted Details

  • Supports LoRA training compatible with Stable Diffusion WebUI extensions.
  • Includes scripts for converting trained Diffusers models to .ckpt format.
  • Offers automatic video captioning using Video-BLIP2-Preprocessor.
  • Allows finetuning from various community models like ZeroScope and Potat1.

Maintenance & Community

The repository is archived and will no longer be updated, with the author recommending the damo-vilab/i2vgen-xl repository for ongoing development. Issues and PRs are kept for posterity.

Licensing & Compatibility

The repository itself does not explicitly state a license in the README. The underlying models it references may have their own licenses. Compatibility for commercial use or closed-source linking is not specified.

Limitations & Caveats

This repository is archived and no longer maintained; the author directs users to the damo-vilab/i2vgen-xl repository for current development. LoRA files trained with stable_lora are not compatible with the repository's inference.py script, and merging LoRA weights is not supported.
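For context on the last limitation: merging a LoRA generally means folding the low-rank update into the frozen base weight, so that no separate LoRA file is needed at inference time. The sketch below shows that operation on plain nested lists. It is generic LoRA arithmetic under illustrative names and values, not a workaround for this repository.

```python
def matmul(x, y):
    """Multiply two matrices stored as nested lists."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*y)] for row in x]


def merge_lora(weight, lora_b, lora_a, scale=1.0):
    """Return weight + scale * (lora_b @ lora_a) as a new nested list."""
    delta = matmul(lora_b, lora_a)
    return [
        [w + scale * d for w, d in zip(w_row, d_row)]
        for w_row, d_row in zip(weight, delta)
    ]


# Toy values chosen for illustration only.
weight = [[1.0, 0.0], [0.0, 1.0]]  # frozen base weight, shape (2, 2)
lora_b = [[1.0], [2.0]]            # shape (d_out=2, rank=1)
lora_a = [[0.5, -0.5]]             # shape (rank=1, d_in=2)
merged = merge_lora(weight, lora_b, lora_a, scale=2.0)
print(merged)
```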

Health Check

  • Last commit: 1 year ago
  • Responsiveness: 1 day
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 5 stars in the last 90 days
