Perceptual image similarity metric trained on human judgements
DreamSim is a novel metric for perceptual image similarity that bridges the gap between low-level pixel/patch comparisons and high-level semantic embeddings. It is designed for researchers and practitioners in computer vision who need a more nuanced understanding of visual similarity, offering improved alignment with human perception for applications like image retrieval and representation learning.
How It Works
DreamSim is built by fine-tuning concatenated embeddings from CLIP, OpenCLIP, and DINO models on a dataset of human perceptual judgments. This approach leverages the strengths of both low-level and high-level feature extractors, allowing it to capture mid-level attributes like layout and pose, which are often missed by existing metrics. The model can be used as a direct similarity metric, for feature extraction, as a perceptual loss function, or for image retrieval.
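To make the idea of concatenated embeddings concrete, here is a minimal sketch in plain Python. It is not the DreamSim API and performs no fine-tuning: the short lists stand in for the outputs of three backbones, and the helper names (`l2_normalize`, `concat_embedding`, `cosine_distance`, `rank_by_distance`) are hypothetical. Normalizing each backbone's vector before concatenation and using cosine distance are both assumptions made for illustration.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def concat_embedding(per_backbone):
    """Normalize each backbone's embedding, then join them into one vector.

    Per-backbone normalization is an assumption of this sketch; it keeps
    one backbone from dominating the result purely through scale.
    """
    combined = []
    for vec in per_backbone:
        combined.extend(l2_normalize(vec))
    return combined


def cosine_distance(a, b):
    """Return 1 - cosine similarity; smaller means more similar."""
    a, b = l2_normalize(a), l2_normalize(b)
    return 1.0 - sum(x * y for x, y in zip(a, b))


def rank_by_distance(query, gallery):
    """Return gallery names sorted from nearest to farthest."""
    return sorted(gallery, key=lambda name: cosine_distance(query, gallery[name]))


if __name__ == "__main__":
    # Made-up stand-ins for three backbone outputs per image.
    query = concat_embedding([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
    near = concat_embedding([[0.9, 0.1], [0.1, 0.9], [1.0, 0.8]])
    far = concat_embedding([[0.0, 1.0], [1.0, 0.0], [-1.0, 1.0]])

    # Retrieval is a nearest-first sort over the gallery.
    print(rank_by_distance(query, {"far": far, "near": near}))  # ['near', 'far']
```

The same distance function covers two of the uses listed above: called on a pair of images it acts as a similarity metric, and sorted over a gallery it performs retrieval.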
Quick Start & Requirements
pip install dreamsim
or clone the repo and install requirements.
Highlighted Details
Maintenance & Community
The project is associated with NeurIPS 2023 and 2024 papers. Code structure is inspired by UniverSeg.
Licensing & Compatibility
The repository does not explicitly state a license. The code borrows from other repositories, whose licenses should be checked for compatibility.
Limitations & Caveats
The NIGHTS dataset is large (58 GB or 289 GB, depending on which version is downloaded) and requires significant storage. The project primarily targets Linux environments.