yunlong10
Curated research on advanced video reasoning with large multimodal models
Top 98.7% on SourcePulse
This repository is a curated "Awesome List" for researchers and developers working to improve the reasoning capabilities of Video Large Multimodal Models (Video-LMMs) through post-training techniques. It tracks recent papers, code, and datasets and organizes them into a structured overview of research in this domain.
How It Works
The project categorizes Video-LMM post-training research into three primary paradigms: Reinforced Video-LMMs, which leverage reinforcement learning techniques (e.g., RLHF, DPO, GRPO) and reward models for alignment; SFT for Reasoning, focusing on supervised fine-tuning with reasoning-centric datasets and structured formats like Chain-of-Thought (CoT); and Test-Time Scaling, exploring inference-time strategies such as agentic frameworks, tool use, RAG, and long CoT. This taxonomy provides a clear framework for understanding diverse approaches to enhancing video understanding and reasoning.
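As a rough illustration of the taxonomy above, the three paradigms and the example techniques named for each can be written down as a small lookup table. This is only a sketch: the `TAXONOMY` dictionary and the `paradigm_for` helper are hypothetical names introduced here, not something the repository provides, and the technique lists contain only the examples mentioned in this summary.

```python
from typing import Optional

# Hypothetical encoding of the three paradigms described above.
# Paradigm names and example techniques are taken from this summary;
# the data structure itself is an assumption made for illustration.
TAXONOMY: dict[str, list[str]] = {
    "Reinforced Video-LMMs": ["RLHF", "DPO", "GRPO", "reward models"],
    "SFT for Reasoning": ["reasoning-centric datasets", "Chain-of-Thought (CoT)"],
    "Test-Time Scaling": ["agentic frameworks", "tool use", "RAG", "long CoT"],
}


def paradigm_for(technique: str) -> Optional[str]:
    """Return the paradigm that lists the technique, or None if it is not listed.

    Matching is case-insensitive but otherwise exact.
    """
    needle = technique.strip().lower()
    for paradigm, techniques in TAXONOMY.items():
        if any(needle == t.lower() for t in techniques):
            return paradigm
    return None


if __name__ == "__main__":
    for name in ("GRPO", "rag", "pre-training"):
        print(f"{name}: {paradigm_for(name)}")
```

A flat mapping like this is enough for tagging entries in a reading list. A technique that spans paradigms would need a mapping from technique to a set of paradigms instead.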
Quick Start & Requirements
This repository is a curated list of research resources and does not provide direct installation or execution commands. Users are directed to individual papers for implementation details, dependencies, and setup instructions.
Highlighted Details
Maintenance & Community
The repository was first released in June 2025, and an accompanying survey paper was published in October 2025. Community contributions are welcome via pull requests.
Licensing & Compatibility
The provided README content does not specify an open-source license. This absence may present compatibility concerns for commercial use or integration into proprietary systems without further clarification.
Limitations & Caveats
As a curated list, this repository does not offer runnable code or direct implementations. Users must consult individual research papers for specific technical requirements, dependencies, and performance metrics. The focus is exclusively on post-training methodologies, potentially excluding foundational model development or pre-training aspects.