ZJU-REAL
Gaussian reward modeling for precise GUI grounding
Top 91.7% on SourcePulse
GUI-G² introduces a novel Gaussian reward modeling framework for training models to perform GUI grounding tasks. It addresses the limitations of traditional reinforcement learning rewards by mimicking human interaction patterns, specifically the Gaussian-like spatial distributions of clicks around targets. This approach offers a more precise and robust method for training models to accurately identify and interact with GUI elements, benefiting researchers and developers working on human-computer interaction, visual language models, and automated UI agents.
How It Works
GUI-G² employs a Gaussian reward framework inspired by human click behavior observed in datasets like AITW. The core innovation lies in its reward functions: Gaussian Point Reward, which rewards proximity to target centers, and Gaussian Coverage Reward, which encourages spatial alignment with the target area. An Adaptive Variance Mechanism dynamically adjusts the reward granularity based on the GUI element's scale. This dense reward signal provides smoother gradients compared to sparse, binary RL rewards, leading to more efficient and effective early-stage learning.
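The two reward terms and the adaptive variance can be illustrated with a minimal Python sketch. The summary does not give the exact formulas, so everything here is an assumption: boxes are `(x1, y1, x2, y2)` tuples, the standard deviation is taken as a hypothetical factor `ALPHA` times the element's width and height, and the coverage term uses the Bhattacharyya coefficient between two axis-aligned Gaussians as one plausible overlap measure. Function names are illustrative and are not taken from the repository.

```python
import math

ALPHA = 0.5   # hypothetical scale factor linking box size to sigma (assumption)
EPS = 1e-6    # guards against zero-width or zero-height boxes


def center(box):
    """Return the (x, y) center of an (x1, y1, x2, y2) box."""
    x1, y1, x2, y2 = box
    return (x1 + x2) / 2.0, (y1 + y2) / 2.0


def adaptive_sigma(box, alpha=ALPHA):
    """Adaptive variance: sigma grows with the element's width and height."""
    x1, y1, x2, y2 = box
    return max(alpha * (x2 - x1), EPS), max(alpha * (y2 - y1), EPS)


def gaussian_point_reward(pred_box, gt_box, alpha=ALPHA):
    """Dense reward in (0, 1]: 1.0 when the predicted center hits the target center."""
    px, py = center(pred_box)
    cx, cy = center(gt_box)
    sx, sy = adaptive_sigma(gt_box, alpha)
    return math.exp(-((px - cx) ** 2 / (2 * sx ** 2) + (py - cy) ** 2 / (2 * sy ** 2)))


def _bhattacharyya_1d(mu1, s1, mu2, s2):
    """Bhattacharyya coefficient between two 1-D Gaussians."""
    var_sum = s1 ** 2 + s2 ** 2
    return math.sqrt(2 * s1 * s2 / var_sum) * math.exp(-((mu1 - mu2) ** 2) / (4 * var_sum))


def gaussian_coverage_reward(pred_box, gt_box, alpha=ALPHA):
    """Overlap of the Gaussians induced by the predicted and target boxes, in (0, 1]."""
    px, py = center(pred_box)
    cx, cy = center(gt_box)
    psx, psy = adaptive_sigma(pred_box, alpha)
    gsx, gsy = adaptive_sigma(gt_box, alpha)
    # Axis-aligned Gaussians factorize, so the 2-D coefficient is a product.
    return _bhattacharyya_1d(px, psx, cx, gsx) * _bhattacharyya_1d(py, psy, cy, gsy)
```

Unlike a binary hit-or-miss reward, both functions return a non-zero value for near misses, which is the property the summary credits for smoother early-stage learning. How the two terms are weighted and combined is not stated in the summary, so the sketch leaves them separate.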
Quick Start & Requirements
Installation involves creating a conda environment (conda create -n gui-g2 python=3.10), activating it (conda activate gui-g2), and running bash setup.sh. Manual dependency installation includes transformers==4.49.0 and deepspeed==0.15.4.
Requirements: transformers, deepspeed, and potentially CUDA-enabled hardware for efficient inference/training (as indicated by device_map="cuda").
Highlighted Details
Maintenance & Community
The paper was released in July 2025, the 3B and 7B models were open-sourced in August 2025, and the paper's acceptance to AAAI 2026 was announced in November 2025. The code repository, which also serves as the primary community channel, is hosted on GitHub.
Licensing & Compatibility
The provided README does not specify a software license. This lack of explicit licensing information presents a significant blocker for evaluating commercial use or closed-source integration compatibility.
Limitations & Caveats
Evaluation checkpoints are noted as "will be released soon," indicating that the evaluation setup might still be under active development or not fully finalized. The project's association with AAAI 2026 suggests it is a recent research contribution and may still be evolving.