NoviScl
Research paper for converting visual design into code implementation
Top 57.5% on SourcePulse
This repository provides a benchmark dataset and tools for evaluating how well front-end web development can be automated from visual designs. It targets researchers and engineers working on multimodal vision-language models (VLMs) for UI generation, offering a standardized way to measure progress in converting screenshots to functional code.
How It Works
The project introduces the Design2Code benchmark, a set of real-world webpages used to test how well a model can turn a screenshot into HTML. It supports evaluation of VLMs by providing code for automatic metrics (Block-Match, Text, Position, Color, CLIP) and scripts for running prompting experiments with models such as GPT-4V, Gemini Pro Vision, and Claude 3.5. The core contribution is the curated dataset and evaluation framework, designed to challenge and quantify VLM capabilities in visual-to-code translation.
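To make the idea of block-level metrics more concrete, here is a minimal sketch of how matched text blocks could be scored for text and position similarity. It is an illustration only and not the repository's implementation: the `Block` class, the function names, the greedy matching, and the 0.5 threshold are all assumptions made for this example, and it uses Python's standard `difflib` for string similarity. The Color and CLIP metrics are left out because they need image data.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher


@dataclass
class Block:
    """Hypothetical text block: its text and a box centre normalised to [0, 1]."""
    text: str
    cx: float
    cy: float


def text_similarity(a: str, b: str) -> float:
    """Character-level similarity in [0, 1] (difflib ratio, an assumed stand-in)."""
    return SequenceMatcher(None, a, b).ratio()


def position_similarity(a: Block, b: Block) -> float:
    """1 minus the larger of the horizontal and vertical centre offsets."""
    return 1.0 - max(abs(a.cx - b.cx), abs(a.cy - b.cy))


def score_page(reference, generated, threshold=0.5):
    """Greedily pair each reference block with its most similar unused
    generated block, then average the per-pair scores."""
    unused = list(generated)
    text_scores, pos_scores = [], []
    for ref in reference:
        if not unused:
            break
        best = max(unused, key=lambda g: text_similarity(ref.text, g.text))
        sim = text_similarity(ref.text, best.text)
        if sim < threshold:
            continue  # no generated block is close enough: count as unmatched
        unused.remove(best)
        text_scores.append(sim)
        pos_scores.append(position_similarity(ref, best))
    matched = len(text_scores)
    return {
        "match_ratio": matched / len(reference) if reference else 0.0,
        "text": sum(text_scores) / matched if matched else 0.0,
        "position": sum(pos_scores) / matched if matched else 0.0,
    }


if __name__ == "__main__":
    reference = [Block("Welcome", 0.5, 0.1), Block("Sign up", 0.5, 0.9)]
    generated = [Block("Welcome", 0.5, 0.2), Block("Log in", 0.1, 0.9)]
    print(score_page(reference, generated))
```

Greedy matching keeps the sketch short; an evaluation that needs an optimal pairing of blocks would more likely use an assignment algorithm, which is outside the scope of this example.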
Quick Start & Requirements
pip install -e .
playwright install
Highlighted Details
Maintenance & Community
Licensing & Compatibility
Limitations & Caveats
The base CogAgent-18B model is noted as performing poorly on this task without fine-tuning. The provided scripts for API access might require minor adjustments for direct OpenAI API calls.
1 year ago
Inactive