OmniGen by VectorSpaceLab

Image generation model for multimodal prompts

created 10 months ago
4,231 stars

Top 11.8% on sourcepulse

Project Summary

OmniGen is a unified image generation model that produces images directly from multi-modal prompts (text, images, or both) without additional plugins or preprocessing. It targets researchers and users who want a single model for tasks such as text-to-image, subject-driven generation, and image editing.

How It Works

OmniGen employs a unified architecture that automatically interprets features from multi-modal inputs (text and images) based on the prompt. This approach eliminates the need for external modules like ControlNet or IP-Adapter, streamlining the generation process and allowing for direct control through natural language and image references.

Quick Start & Requirements

Highlighted Details

  • Supports text-to-image, subject-driven generation, identity-preserving generation, image editing, and image-conditioned generation.
  • Handles multi-modal prompts with image placeholders (e.g., <|image_1|>).
  • Offers LoRA fine-tuning capabilities with provided scripts.
  • Available in Hugging Face Diffusers library.
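
The placeholder convention above can be illustrated with a small, library-free sketch. It only checks that the `<|image_N|>` placeholders in a prompt line up with the list of input images. The helper names `placeholder_indices` and `check_prompt` are hypothetical and not part of OmniGen, and the rule that placeholders are numbered 1..N with no gaps is an assumption made for this example.

```python
import re

# Matches placeholders of the form <|image_1|>, <|image_2|>, ...
PLACEHOLDER = re.compile(r"<\|image_(\d+)\|>")


def placeholder_indices(prompt: str) -> list[int]:
    """Return the sorted, de-duplicated image indices referenced in a prompt."""
    return sorted({int(n) for n in PLACEHOLDER.findall(prompt)})


def check_prompt(prompt: str, image_paths: list[str]) -> None:
    """Raise ValueError unless the prompt references exactly images 1..N.

    Assumption for this sketch: placeholders are 1-based and contiguous,
    and each one maps to the image at the same position in image_paths.
    """
    found = placeholder_indices(prompt)
    expected = list(range(1, len(image_paths) + 1))
    if found != expected:
        raise ValueError(
            f"prompt references images {found}, "
            f"but {len(image_paths)} image(s) were supplied"
        )


if __name__ == "__main__":
    prompt = "The woman in <|image_1|> waves at the man in <|image_2|>."
    check_prompt(prompt, ["woman.png", "man.png"])  # passes silently
    print(placeholder_indices(prompt))
```

A check of this kind would run before the prompt and images are handed to the model, so that a missing or extra image is reported early rather than surfacing as a confusing generation result.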

Maintenance & Community

Licensing & Compatibility

  • Licensed under the MIT License, permitting commercial use and closed-source linking.

Limitations & Caveats

  • The README notes that OmniGen still has room for improvement due to limited resources. Specific resource requirements for efficient operation are detailed in docs/inference.md.

Health Check

  • Last commit: 1 month ago
  • Responsiveness: 1 day
  • Pull requests (30d): 0
  • Issues (30d): 1
  • Star history: 255 stars in the last 90 days
