Evaluation framework for text-to-image alignment research
GenEval provides an object-focused framework for evaluating text-to-image alignment, addressing the limitations of holistic metrics like FID and CLIPScore. It enables fine-grained, instance-level analysis of compositional capabilities, such as object co-occurrence, position, count, and color, making it valuable for researchers and developers of text-to-image models.
How It Works
GenEval leverages existing object detection models to analyze generated images. This approach allows for a granular assessment of how well generated images adhere to specific compositional instructions in the text prompt. By integrating with object detectors, it provides instance-level feedback on properties like object presence, spatial relationships, and attributes, offering a more insightful evaluation than global metrics.
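To make the idea concrete, here is a minimal sketch of instance-level checking against compositional prompt constraints. The detection format, function names, and checks are hypothetical illustrations, not GenEval's actual API; it assumes an object detector has already produced labeled, colored bounding boxes.

```python
# Hypothetical sketch of object-focused evaluation: detections are assumed
# to come from an upstream object detector (e.g. a Mask2Former model).
from dataclasses import dataclass

@dataclass
class Detection:
    label: str   # detected class, e.g. "apple"
    color: str   # dominant color of the instance crop
    box: tuple   # (x_min, y_min, x_max, y_max)

def check_count(dets, label, expected):
    """Does the image contain exactly `expected` instances of `label`?"""
    return sum(d.label == label for d in dets) == expected

def check_color(dets, label, color):
    """Is at least one `label` instance rendered in `color`?"""
    return any(d.label == label and d.color == color for d in dets)

def check_left_of(dets, left_label, right_label):
    """Is some `left_label` instance entirely left of some `right_label`?"""
    lefts = [d for d in dets if d.label == left_label]
    rights = [d for d in dets if d.label == right_label]
    return any(l.box[2] < r.box[0] for l in lefts for r in rights)

# Example: prompt "two red apples to the left of a dog"
dets = [
    Detection("apple", "red", (10, 40, 60, 90)),
    Detection("apple", "red", (70, 40, 120, 90)),
    Detection("dog", "brown", (200, 30, 320, 150)),
]
passed = (check_count(dets, "apple", 2)
          and check_color(dets, "apple", "red")
          and check_left_of(dets, "apple", "dog"))
```

Each check yields a per-instance pass/fail signal, which is what lets this style of evaluation pinpoint the specific compositional skill a model gets wrong, unlike a single global score.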
Quick Start & Requirements
Setup requires creating the conda environment (conda env create -f environment.yml, then conda activate geneval), installing mmdetection (version 2.x), and downloading a Mask2Former object detector model.

Highlighted Details
Maintenance & Community
The project is associated with the paper "GenEval: An Object-Focused Framework for Evaluating Text-to-Image Alignment" by Dhruba Ghosh, Hanna Hajishirzi, and Ludwig Schmidt. Further community or maintenance details are not explicitly provided in the README.
Licensing & Compatibility
The repository's license is not explicitly stated in the README. Compatibility for commercial use or closed-source linking would require clarification of the licensing terms.
Limitations & Caveats
The README notes that while recent models have improved significantly, they still struggle with complex capabilities such as spatial relations and attribute binding; these are precisely the failure modes GenEval is designed to surface.