Image editing model for instruction-based manipulation
Top 48.2% on SourcePulse
HiDream-E1 is an instruction-based image editing model designed for users seeking advanced image manipulation capabilities. It builds upon the HiDream-I1 model, offering enhanced features for transforming images according to textual prompts, targeting researchers and power users in AI-driven creative workflows.
How It Works
HiDream-E1 leverages a diffusion model architecture, incorporating Llama-3.1-8B-Instruct for instruction understanding and an optional transformer model (HiDream-I1-Full) for instruction refinement. The editing process involves a two-stage denoising approach: the initial phase performs the core editing based on the prompt, while the latter phase uses the refinement model to enhance the final output, controlled by a refine_strength parameter. This dual-stage process aims to provide both precise editing and high-quality visual refinement.
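The two-stage split can be sketched as follows. This is an illustrative sketch only, not the repository's actual implementation: the function name and the assumption that refine_strength denotes the fraction of denoising steps handed to the refinement model are hypothetical.

```python
def split_denoising_steps(num_steps: int, refine_strength: float) -> tuple[int, int]:
    """Hypothetical illustration of the two-stage denoising split.

    Assumes refine_strength is the fraction of the total denoising steps
    given to the refinement model (HiDream-I1-Full); the remaining steps
    perform the core instruction-based edit.
    """
    if not 0.0 <= refine_strength <= 1.0:
        raise ValueError("refine_strength must be in [0, 1]")
    refine_steps = int(num_steps * refine_strength)  # later steps: refinement
    edit_steps = num_steps - refine_steps            # earlier steps: core edit
    return edit_steps, refine_steps

# e.g. 28 steps with refine_strength=0.3 -> 20 editing steps, 8 refinement steps
edit_steps, refine_steps = split_denoising_steps(28, 0.3)
```

Setting refine_strength to 0 would skip refinement entirely; higher values trade editing steps for refinement steps.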
Quick Start & Requirements
Install dependencies:
pip install -r requirements.txt
pip install -U flash-attn --no-build-isolation
pip install -U git+https://github.com/huggingface/diffusers.git
Run python ./inference.py or integrate into custom code. A Gradio demo is available via python gradio_demo.py.
Highlighted Details
The refine_strength parameter balances the editing and refinement stages.
Maintenance & Community
Licensing & Compatibility
Limitations & Caveats
The model and code are explicitly stated to be under development and subject to frequent updates, which may introduce breaking changes. Instruction refinement requires a VLM API key, adding an external dependency.