OmniGen by VectorSpaceLab

Image generation model for multimodal prompts

Created 1 year ago

4,328 stars

Top 11.1% on SourcePulse

1 Expert Loves This Project

Jiaming Song

Chief Scientist at Luma AI

Project Summary

OmniGen is a unified image generation model designed to simplify multi-modal image creation, enabling users to generate diverse images from various prompts without additional plugins or preprocessing. It targets researchers and users seeking a flexible, all-in-one solution for tasks like text-to-image, subject-driven generation, and image editing.

How It Works

OmniGen employs a unified architecture that automatically interprets features from multi-modal inputs (text and images) based on the prompt. This approach eliminates the need for external modules like ControlNet or IP-Adapter, streamlining the generation process and allowing for direct control through natural language and image references.

Quick Start & Requirements

Install from source: clone the repository, then run pip install -e . in the repository root.
Requires Python 3.10.13 and PyTorch with CUDA support (e.g., pip install torch==2.3.1+cu118 torchvision --extra-index-url https://download.pytorch.org/whl/cu118).
Official Hugging Face Demo: https://huggingface.co/spaces/VectorSpaceLab/OmniGen
Inference Documentation: docs/inference.md
Fine-tuning Documentation: docs/fine-tuning.md
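
Before installing, it can help to confirm that the local interpreter and PyTorch build match the requirements listed above. The following is a minimal sketch, not part of OmniGen: the function name check_environment is hypothetical, and comparing only the major.minor Python version is an assumption, since the summary names 3.10.13 without saying whether other versions work.

```python
import importlib.util
import sys

# Version named in the requirements above; treating only major.minor as
# significant is an assumption made for this sketch.
EXPECTED_PYTHON = (3, 10)


def check_environment() -> dict:
    """Report whether the interpreter and PyTorch look ready for GPU inference."""
    report = {
        "python_version": ".".join(str(part) for part in sys.version_info[:3]),
        "python_matches": sys.version_info[:2] == EXPECTED_PYTHON,
        "torch_installed": importlib.util.find_spec("torch") is not None,
        "cuda_available": False,
    }
    if report["torch_installed"]:
        try:
            import torch  # imported lazily so the check also runs without PyTorch

            report["cuda_available"] = bool(torch.cuda.is_available())
        except Exception:
            # A broken install is reported the same way as missing CUDA support.
            report["cuda_available"] = False
    return report


if __name__ == "__main__":
    for key, value in check_environment().items():
        print(f"{key}: {value}")
```

If cuda_available is reported as False after installing PyTorch, the installed wheel may be a CPU-only build rather than the CUDA build from the index URL given above.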

Highlighted Details

Supports text-to-image, subject-driven generation, identity-preserving generation, image editing, and image-conditioned generation.
Handles multi-modal prompts with image placeholders (e.g., <|image_1|>).
Offers LoRA fine-tuning capabilities with provided scripts.
Available in the Hugging Face Diffusers library.
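
To illustrate the image placeholder convention mentioned above, here is a small sketch that checks a prompt against a list of input images before it is handed to a model. The helpers placeholder_indices and validate_prompt are hypothetical and not part of OmniGen. The rule that placeholders must be numbered 1 through N with no gaps is an assumption made for the example, not something the summary states.

```python
import re

# Placeholder syntax taken from the summary above: <|image_1|>, <|image_2|>, ...
PLACEHOLDER = re.compile(r"<\|image_(\d+)\|>")


def placeholder_indices(prompt: str) -> list[int]:
    """Return the sorted, de-duplicated image indices referenced in a prompt."""
    return sorted({int(index) for index in PLACEHOLDER.findall(prompt)})


def validate_prompt(prompt: str, image_paths: list[str]) -> None:
    """Raise ValueError unless the placeholders are exactly 1..len(image_paths).

    Assumption for this sketch: every supplied image must be referenced once
    or more, and no placeholder may point past the end of the image list.
    """
    found = placeholder_indices(prompt)
    expected = list(range(1, len(image_paths) + 1))
    if found != expected:
        raise ValueError(
            f"prompt references images {found}, but {len(image_paths)} "
            f"image(s) were supplied (expected {expected})"
        )


if __name__ == "__main__":
    # Hypothetical file names, used only to show the call shape.
    prompt = "The woman in <|image_1|> waves at the man in <|image_2|>."
    validate_prompt(prompt, ["woman.png", "man.png"])
    print(placeholder_indices(prompt))
```

A check like this catches a mismatch between the prompt and the image list early, before any model weights are loaded.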

Maintenance & Community

Most recent updates (Oct-Dec 2024) include inference code optimization and a dataset release.
Contact for inquiries: 2906698981@qq.com, wangyueze@tju.edu.cn, zhengliu1026@gmail.com.

Licensing & Compatibility

Licensed under the MIT License, permitting commercial use and closed-source linking.

Limitations & Caveats

The README notes that OmniGen still has room for improvement due to limited resources. Specific resource requirements for efficient operation are detailed in docs/inference.md.

Health Check

Last Commit

7 months ago

Responsiveness

1 day

Star History

7 stars in the last 30 days