DreamLite by ByteVisionLab

On-device unified model for image generation and editing

Created 3 months ago

720 stars

Top 46.9% on SourcePulse

Project Summary

DreamLite is a compact, unified on-device diffusion model (0.39B parameters) for both text-to-image generation and text-guided image editing. Aimed at mobile users and developers, it runs creative tasks directly on the device without cloud dependency, which keeps user data local and avoids network round trips.

How It Works

The architecture uses a pruned mobile U-Net backbone and unifies conditioning through in-context spatial concatenation in the latent space, so that generation and editing inputs pass through the same network. Combined with step distillation, this design allows 4-step inference, which cuts compute and memory requirements enough to make image generation and editing feasible on resource-constrained hardware.
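The summary does not spell out how the in-context spatial concatenation is implemented. The following is a minimal sketch of one plausible reading, assuming the noisy target latent and a reference-image latent are placed side by side along the width axis before the denoiser sees them, and the target region is cropped back out afterwards. The helper names (make_latent, concat_width, crop_target) and the toy shapes are hypothetical and are not taken from the DreamLite code.

```python
from typing import List

# A latent is represented as nested lists: [channels][height][width].
Latent = List[List[List[float]]]


def make_latent(channels: int, height: int, width: int, value: float) -> Latent:
    """Build a constant-valued toy latent (hypothetical helper)."""
    return [[[value] * width for _ in range(height)] for _ in range(channels)]


def concat_width(target: Latent, reference: Latent) -> Latent:
    """Place two latents side by side along the width axis."""
    if len(target) != len(reference):
        raise ValueError("channel counts differ")
    joint = []
    for t_chan, r_chan in zip(target, reference):
        if len(t_chan) != len(r_chan):
            raise ValueError("heights differ")
        # Each output row is the target row followed by the reference row.
        joint.append([t_row + r_row for t_row, r_row in zip(t_chan, r_chan)])
    return joint


def crop_target(joint: Latent, target_width: int) -> Latent:
    """Keep only the left (target) region of a joint latent."""
    return [[row[:target_width] for row in chan] for chan in joint]


# Toy example: 4 channels, 2x3 spatial grid for both latents.
target = make_latent(4, 2, 3, 1.0)
reference = make_latent(4, 2, 3, 2.0)
joint = concat_width(target, reference)      # spatial grid becomes 2x6
recovered = crop_target(joint, target_width=3)
```

Under this assumption the network's channel count stays the same for generation and editing; only the spatial extent of its input changes when a reference image is present.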

Quick Start & Requirements

To get the repository, clone it with: git clone https://github.com/ByteVisionLab/DreamLite.git

The README demonstrates on-device inference on an iPhone 17 Pro. Inference code and model weights are not yet released (see Limitations & Caveats).

Highlighted Details

Performance: Generates or edits a 1024×1024 image in approximately 3 seconds on an iPhone 17 Pro using 4-step inference, with 4-bit Qwen VL and fp16 VAE+UNet.
On-Device Capability: Operates fully on-device, requiring no cloud connectivity, which enhances privacy and reduces latency.
Unified Functionality: Integrates both text-to-image generation and text-guided image editing within a single, lightweight network.
Model Size: 0.39B parameters, significantly smaller than many contemporary diffusion models.
Quantitative Results: DreamLite reports a GenEval score of 0.72 and an ImgEdit score of 4.11, outperforming some larger models on specific metrics.
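As a rough back-of-the-envelope check on the figures above, the sketch below estimates weight storage for 0.39B parameters and the average time budget per denoising step. It assumes the 0.39B count refers to weights stored at the listed precision, and that the roughly 3 seconds covers the whole pipeline, so the per-step value is only an upper bound. Neither assumption is confirmed by the summary, and the function name weight_bytes is illustrative.

```python
def weight_bytes(num_params: int, bits_per_param: int) -> int:
    """Storage needed for the weights alone, in bytes."""
    return num_params * bits_per_param // 8


NUM_PARAMS = 390_000_000  # 0.39B parameters, as stated in the summary

fp16_bytes = weight_bytes(NUM_PARAMS, 16)  # 2 bytes per parameter
int4_bytes = weight_bytes(NUM_PARAMS, 4)   # 0.5 bytes per parameter (for comparison only)

# Upper bound on average time per denoising step, assuming the whole
# ~3 s budget were spent on the 4 steps (it also covers text encoding
# and VAE decoding, so the real per-step time is lower).
TOTAL_SECONDS = 3.0
NUM_STEPS = 4
max_seconds_per_step = TOTAL_SECONDS / NUM_STEPS

print(f"fp16 weights: {fp16_bytes / 1e9:.2f} GB")
print(f"4-bit weights: {int4_bytes / 1e9:.3f} GB")
print(f"per-step budget: at most {max_seconds_per_step:.2f} s")
```

Activations, the text encoder, and runtime overhead come on top of the weight storage, so these numbers are a lower bound on actual memory use.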

Maintenance & Community

The project is under the supervision of Prof. Wangmeng Zuo. No specific community channels or detailed maintenance roadmaps are provided in the README.

Licensing & Compatibility

The README does not specify a software license, which may impact commercial use or integration.

Limitations & Caveats

The inference code and model weights are not yet released, indicating an early-stage open-source effort. Planned releases include an online demo and mobile applications, which are not currently available.

Health Check

Last Commit: 4 weeks ago

Responsiveness: Inactive

Star History

32 stars in the last 30 days