Awesome-Visual-Reinforcement-Learning by weijiawu

Visual Reinforcement Learning resource hub

Created 3 months ago

258 stars

Top 98.2% on SourcePulse

Project Summary

This repository curates papers, code, and resources for Visual Reinforcement Learning (Visual RL), the study of agents that learn from visual inputs. It gives researchers and practitioners a structured overview of the rapidly evolving Visual RL landscape, from perception to real-world applications, and is anchored by a comprehensive survey paper.

How It Works

The project groups Visual RL research into high-level domains (Multimodal Large Language Models (MLLMs), visual generation, unified models, and vision-language action agents) and then subdivides each domain by specific task. The accompanying survey paper, "Reinforcement Learning in Vision: A Survey," details this structure and maps the field by highlighting representative papers for each branch.

Quick Start & Requirements

This repository is a curated list and survey of research papers, not a software package, so it has no installation requirements. Users can browse the organized list of papers and resources. The project also offers an "Awesome-Paper-Agent" tool that helps format paper information for contributions.

Highlighted Details

Comprehensive coverage of Visual RL subfields including MLLMs, visual generation, unified models, and vision-language action agents.
Detailed categorization of papers, facilitating navigation through specific tasks like spatial reasoning, image editing, video generation, GUI interaction, visual navigation, and robotic manipulation.
Includes resources on benchmarks, environments, and foundational concepts like visual world models, reinforcement learning theory, and implementation details of algorithms like PPO and GRPO.
Actively maintained with a focus on community contributions for expanding the paper list and resources.
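
The domain-and-task organization described in this summary can be pictured as a small nested mapping. The sketch below is only an illustration: the domain and task names are taken from this page, but the assignment of each task to a domain, the `TAXONOMY` name, and the `find_domain` helper are assumptions and are not taken from the repository or the survey.

```python
from typing import Dict, List, Optional

# Hypothetical illustration: domain and task names come from this summary,
# but which task sits under which domain is an assumption.
TAXONOMY: Dict[str, List[str]] = {
    "Multimodal Large Language Models": ["spatial reasoning"],
    "Visual generation": ["image editing", "video generation"],
    "Unified models": [],  # no specific tasks are named on this page
    "Vision-language action agents": [
        "GUI interaction",
        "visual navigation",
        "robotic manipulation",
    ],
}


def find_domain(task: str) -> Optional[str]:
    """Return the domain whose task list contains `task` (case-insensitive), or None."""
    wanted = task.strip().lower()
    for domain, tasks in TAXONOMY.items():
        if any(t.lower() == wanted for t in tasks):
            return domain
    return None


if __name__ == "__main__":
    print(find_domain("video generation"))
```

A flat two-level mapping is enough here because the summary describes only domains refined by tasks; a deeper hierarchy would need details that this page does not give.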

Maintenance & Community

The repository is actively maintained and welcomes issues and pull requests that add missing papers or information. An "Awesome-Paper-Agent" tool helps contributors format paper information. The maintainer can be reached by email (weijiawu96@gmail.com), and a citation for the survey paper is provided.

Licensing & Compatibility

The repository does not state a software license of its own. Its content consists primarily of links to research papers, each of which is governed by its own license.

Limitations & Caveats

As a curated list of research papers, this repository does not provide executable code or a unified framework for Visual RL. Users must refer to the individual papers for implementation details, code availability, and specific requirements. Because the field evolves rapidly, the list changes frequently and may be missing papers at any given time.

Explore Similar Projects

Embodied-AI-Paper-TopConf by Songwxuan

awesome-multi-modal-reinforcement-learning by opendilab

scalingup by real-stanford

Awesome_Think_With_Images by zhaochen0110

LLM-in-Vision by DirtyHarryLYL

Awesome-RL-based-Reasoning-MLLMs by Sun-Haoyuan23

allenact by allenai

awesome-embodied-vla-va-vln by jonyzhang2023

Visual-RFT by Liuziyu77

Magma by microsoft

RLBench by stepjam

Embodied-AI-Guide by TianxingChen