HPSv3 by MizzenAI

VLM-based model for wide-spectrum human preference scoring

Created 11 months ago

316 stars

Top 85.3% on SourcePulse

Project Summary

HPSv3 provides a Wide-Spectrum Human Preference Score for evaluating AI-generated images and offers an iterative refinement method (CoHP) to improve image quality. It targets researchers and developers in generative AI, enabling more accurate image assessment and efficient quality enhancement.

How It Works

The core is HPSv3, a VLM-based preference model built on Qwen2-VL, trained on the extensive HPDv3 dataset (1.08M text-image pairs, 1.17M comparisons) covering diverse generative models and image qualities. Complementing this is CoHP (Chain-of-Human-Preference), a novel reasoning approach for iterative image refinement. CoHP uses multi-model generation, reward scoring, and image-to-image generation to enhance quality without requiring additional training data.
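The CoHP loop described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the project's implementation: `cohp_refine`, `generators`, `score`, and `img2img` are hypothetical names, and the two-stage structure (pick the best candidate across models, then refine it with image-to-image generation while the reward improves) is inferred from the summary alone.

```python
from typing import Callable, Sequence, Tuple, TypeVar

Image = TypeVar("Image")  # stand-in for whatever image type the pipeline uses


def cohp_refine(
    prompt: str,
    generators: Sequence[Callable[[str], Image]],   # hypothetical text-to-image models
    score: Callable[[str, Image], float],           # hypothetical reward model (e.g. a preference scorer)
    img2img: Callable[[str, Image], Image],         # hypothetical image-to-image refiner
    rounds: int = 3,
) -> Tuple[Image, float]:
    """Return the best image found and its reward score."""
    if not generators:
        raise ValueError("at least one generator is required")

    # Stage 1 (assumed): every model proposes a candidate; keep the top-scoring one.
    scored = [(score(prompt, img), img) for img in (g(prompt) for g in generators)]
    best_score, best = max(scored, key=lambda pair: pair[0])

    # Stage 2 (assumed): refine the current best and keep a refinement
    # only when the reward model rates it strictly higher.
    for _ in range(rounds):
        candidate = img2img(prompt, best)
        candidate_score = score(prompt, candidate)
        if candidate_score > best_score:
            best, best_score = candidate, candidate_score

    return best, best_score
```

No model is trained anywhere in the loop: the reward model only ranks candidates, which matches the claim that quality improves without additional training data.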

Quick Start & Requirements

Inference is available via PyPI (pip install hpsv3). For development, clone the repo, set up the Conda environment, and install dependencies such as flash-attn==2.7.4.post1. Basic usage involves initializing HPSv3RewardInferencer in Python. An interactive Gradio demo and a command-line interface for CoHP are also provided. Links to the project website, the arXiv paper, and the model/dataset hubs are available via badges.
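A hedged sketch of what basic scoring might look like is shown below. Only the package name and the `HPSv3RewardInferencer` class come from the summary above. The `reward(prompts, image_paths=...)` method, the `device` argument, and the assumption that the first element of each result is the score are unverified here and should be checked against the project's README. The helper `score_images` is a hypothetical wrapper.

```python
from typing import Any, List, Sequence


def score_images(
    inferencer: Any,
    prompts: Sequence[str],
    image_paths: Sequence[str],
) -> List[float]:
    """Score each (prompt, image) pair with a reward inferencer.

    ASSUMPTION: `inferencer.reward(prompts, image_paths=...)` returns one
    result per pair whose first element is the preference score.
    """
    if len(prompts) != len(image_paths):
        raise ValueError("prompts and image_paths must have the same length")
    rewards = inferencer.reward(list(prompts), image_paths=list(image_paths))
    return [float(r[0]) for r in rewards]


# Intended usage, left commented out because it needs the package, the
# model weights, and most likely a GPU:
#
#   from hpsv3 import HPSv3RewardInferencer           # class named in the summary
#   inferencer = HPSv3RewardInferencer(device="cuda")  # `device` is an assumption
#   print(score_images(inferencer, ["a red fox"], ["fox.png"]))
```

Passing the inferencer in as an argument keeps the helper testable with a stand-in object, so the call pattern can be checked without downloading model weights.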

Highlighted Details

HPSv3 reports state-of-the-art preference scoring, outperforming existing models on three benchmarks: HPDv3 (76.9), PickScore (72.8), and ImageReward (66.8).
The HPDv3 dataset is comprehensive, featuring 1.08M text-image pairs and 1.17M comparisons from diverse generative sources.
CoHP offers an efficient iterative image refinement technique, leveraging reward models to improve quality without retraining.
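The summary does not say what the benchmark figures above measure. Assuming they are pairwise preference accuracies (the percentage of human-labeled comparisons in which the model scores the preferred image higher), the metric can be computed as in this sketch. The function name and the tie-handling rule are illustrative choices, not taken from the project.

```python
from typing import Sequence, Tuple


def pairwise_accuracy(score_pairs: Sequence[Tuple[float, float]]) -> float:
    """Fraction of comparisons where the human-preferred image wins.

    Each pair is (score_of_preferred_image, score_of_rejected_image).
    Ties are counted as errors here; that is an assumption of this sketch.
    """
    if not score_pairs:
        raise ValueError("score_pairs must not be empty")
    correct = sum(1 for preferred, rejected in score_pairs if preferred > rejected)
    return correct / len(score_pairs)


# Four comparisons: two ranked correctly, one reversed, one tied.
pairs = [(0.9, 0.1), (0.2, 0.5), (1.0, 0.3), (0.4, 0.4)]
accuracy = pairwise_accuracy(pairs)
print(f"{accuracy:.1%}")  # prints 50.0%
```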

Maintenance & Community

Developed by researchers from Mizzen AI and CUHK MMLab, among others. Support is available via GitHub Issues and email. The August 2025 releases cover the HPDv3 dataset, inference and training code, CoHP, model weights, and a PyPI package.

Licensing & Compatibility

The README does not specify a software license, which may impact commercial use or integration into closed-source projects.

Limitations & Caveats

The README does not document specific limitations, known bugs, or alpha status. The flash-attn dependency suggests that a CUDA-capable GPU is expected, at least for the development setup.

Health Check

Last Commit: 7 months ago

Responsiveness: Inactive

Star History

11 stars in the last 30 days