DreamLite by ByteVisionLab

On-device unified model for image generation and editing

Created 1 week ago

352 stars

Top 79.1% on SourcePulse

Project Summary

DreamLite is a compact, unified on-device diffusion model (0.39B parameters) that handles both text-to-image generation and text-guided image editing. It targets mobile users and developers: inference runs directly on the device with no cloud dependency, which keeps prompts and images local and avoids network round trips.

How It Works

The architecture uses a pruned mobile U-Net backbone and unifies conditioning through in-context spatial concatenation in the latent space, so generation and editing inputs pass through the same network. Step distillation reduces sampling to 4 inference steps. Together, the small backbone and the short sampling schedule lower compute and memory requirements, which makes image generation and editing feasible on resource-constrained hardware.
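
The idea of in-context spatial concatenation can be illustrated with a small sketch. This is not DreamLite's code (which is unreleased): the left/right layout, the use of an all-zero reference latent for text-to-image, and the helper names `zeros`, `concat_width`, `build_unified_input`, and `shape` are all assumptions made for illustration. It only shows how one input layout could serve both tasks.

```python
# Illustrative sketch only. Latents are tiny nested lists shaped
# [channels][height][width]. The layout (reference left, target right) and the
# blank reference for text-to-image are assumptions, not DreamLite's design.

def zeros(channels, height, width):
    """Return an all-zero latent of shape [channels][height][width]."""
    return [[[0.0] * width for _ in range(height)] for _ in range(channels)]


def concat_width(left, right):
    """Concatenate two [C][H][W] latents along the width (spatial) axis."""
    if len(left) != len(right) or len(left[0]) != len(right[0]):
        raise ValueError("latents must share channel count and height")
    return [
        [l_row + r_row for l_row, r_row in zip(l_ch, r_ch)]
        for l_ch, r_ch in zip(left, right)
    ]


def build_unified_input(target_latent, reference_latent=None):
    """Build one network input for either task.

    Editing: pass the encoded source image as reference_latent.
    Generation: omit it, and a blank reference is used (an assumption).
    """
    channels = len(target_latent)
    height = len(target_latent[0])
    width = len(target_latent[0][0])
    if reference_latent is None:
        reference_latent = zeros(channels, height, width)
    return concat_width(reference_latent, target_latent)


def shape(latent):
    """Return (channels, height, width) of a nested-list latent."""
    return (len(latent), len(latent[0]), len(latent[0][0]))


if __name__ == "__main__":
    noisy_target = zeros(4, 8, 8)
    source_image = zeros(4, 8, 8)
    print(shape(build_unified_input(noisy_target)))                # generation
    print(shape(build_unified_input(noisy_target, source_image)))  # editing
```

In this sketch both tasks produce an input of the same shape, with the width doubled, so a single backbone could consume either one without task-specific input branches.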

Quick Start & Requirements

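The only setup step described is cloning the repository: git clone https://github.com/ByteVisionLab/DreamLite.git. Because the inference code and model weights are not yet released (see Limitations & Caveats), the clone does not yet provide a runnable model. The on-device inference results are demonstrated on an iPhone 17 Pro.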

Highlighted Details

  • Performance: Generates or edits a 1024x1024 image in approximately 3 seconds on an iPhone 17 Pro using 4-step inference, with 4-bit Qwen VL and fp16 VAE + U-Net.
  • On-Device Capability: Operates fully on-device, requiring no cloud connectivity, which enhances privacy and reduces latency.
  • Unified Functionality: Integrates both text-to-image generation and text-guided image editing within a single, lightweight network.
  • Model Size: At 0.39B parameters, the model is much smaller than many contemporary diffusion models.
  • Quantitative Results: DreamLite reports a GenEval score of 0.72 and an ImgEdit score of 4.11, outperforming some larger models on specific metrics.
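
The figures above allow some back-of-the-envelope arithmetic. The sketch below is only an estimate under stated assumptions: it treats all 0.39B parameters as stored in fp16 (the summary does not say exactly which components the 0.39B count covers), and it divides the whole 3-second figure evenly across the 4 steps, which gives an upper bound because that time may also cover text encoding and VAE decoding. The function names are hypothetical.

```python
# Rough estimates derived from the numbers quoted in the list above.
# Assumptions: every parameter is stored at the given bit width, and the
# quoted latency is split evenly across denoising steps (an upper bound).

def weight_bytes(num_params, bits_per_param):
    """Storage needed for the weights alone, in bytes."""
    return num_params * bits_per_param // 8


def per_step_budget(total_seconds, num_steps):
    """Upper bound on the time available per denoising step, in seconds."""
    return total_seconds / num_steps


PARAMS = 390_000_000   # 0.39B parameters
FP16_BITS = 16

if __name__ == "__main__":
    size_gb = weight_bytes(PARAMS, FP16_BITS) / 1e9
    print(f"fp16 weights: about {size_gb:.2f} GB")
    print(f"per-step budget: at most {per_step_budget(3.0, 4):.2f} s")
```

Under these assumptions the fp16 weights take about 0.78 GB, and each of the 4 steps has at most 0.75 seconds. Activations, the VAE, and the 4-bit Qwen VL model add to the memory total and are not estimated here.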

Maintenance & Community

The project is supervised by Prof. Wangmeng Zuo. The README lists no community channels and no maintenance roadmap.

Licensing & Compatibility

The README does not specify a software license, so the terms for commercial use or integration are unclear.

Limitations & Caveats

The inference code and model weights have not yet been released, so the project is at an early stage as an open-source effort. An online demo and mobile applications are planned but not yet available.

Health Check

  • Last Commit: 1 week ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 6
  • Star History: 355 stars in the last 13 days
