Framework for scaling RL to long video sequences
This repository provides a full-stack framework, Long-RL, for scaling Reinforcement Learning (RL) to long video reasoning tasks. It addresses challenges in processing extended video sequences by integrating a large-scale dataset (LongVideo-Reason), a two-stage training pipeline (CoT-SFT and RL), and an efficient training infrastructure (MR-SP). The framework is designed for researchers and engineers working with vision-language models (VLMs) and long-form video content.
How It Works
Long-RL employs a two-stage training process: Chain-of-Thought Supervised Fine-Tuning (CoT-SFT) followed by Reinforcement Learning (RL). The core innovation lies in the Multi-modal Reinforcement Sequence Parallelism (MR-SP) training infrastructure, which utilizes sequence parallelism and a vLLM-based engine. This approach enables efficient processing of long videos by caching video embeddings and employing prefilling techniques, significantly speeding up RL training for extended sequences.
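The two ideas named above, splitting a long frame sequence across workers and caching the resulting video embeddings for reuse, can be illustrated with a minimal sketch. This is not the Long-RL or MR-SP implementation: `shard_frames`, `EmbeddingCache`, and `toy_encoder` are hypothetical names invented here, the "workers" run sequentially in one process, and the encoder is a stand-in for a real vision tower.

```python
from typing import Callable, Dict, List

Embedding = List[float]


def shard_frames(frames: List[int], world_size: int) -> List[List[int]]:
    """Split frames into contiguous, near-equal shards, one per worker."""
    base, extra = divmod(len(frames), world_size)
    shards: List[List[int]] = []
    start = 0
    for rank in range(world_size):
        # The first `extra` ranks take one additional frame each.
        size = base + (1 if rank < extra else 0)
        shards.append(frames[start:start + size])
        start += size
    return shards


class EmbeddingCache:
    """Encode each video once, then reuse the result for every rollout."""

    def __init__(
        self,
        encode_shard: Callable[[List[int]], List[Embedding]],
        world_size: int,
    ) -> None:
        self.encode_shard = encode_shard
        self.world_size = world_size
        self._store: Dict[str, List[Embedding]] = {}
        self.encoder_calls = 0  # counts shard-level encoder invocations

    def get(self, video_id: str, frames: List[int]) -> List[Embedding]:
        if video_id not in self._store:
            embeddings: List[Embedding] = []
            # Assumption: a real system would run each shard on its own
            # device and gather the results; here shards run one by one.
            for shard in shard_frames(frames, self.world_size):
                embeddings.extend(self.encode_shard(shard))
                self.encoder_calls += 1
            self._store[video_id] = embeddings
        return self._store[video_id]


def toy_encoder(shard: List[int]) -> List[Embedding]:
    """Hypothetical stand-in for a vision encoder: 2-d vector per frame."""
    return [[float(f), float(f) * 0.5] for f in shard]


cache = EmbeddingCache(toy_encoder, world_size=4)
frames = list(range(10))

# Eight rollouts for the same video trigger only one round of encoding.
rollouts = [cache.get("video-0", frames) for _ in range(8)]
print(cache.encoder_calls)
```

In this toy setup the encoder runs once per shard regardless of how many rollouts are sampled for the same video, which is the kind of saving that caching is meant to provide when many responses are generated per prompt during RL training.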
Quick Start & Requirements
git clone https://github.com/NVlabs/Long-RL.git
cd Long-RL
pip install -e .
Highlighted Details
Maintenance & Community
The project is actively maintained, with recent updates in July 2025. Key contributors include Yukang Chen, Wei Huang, and Song Han. The project builds upon the EasyR1 and verl frameworks.
Licensing & Compatibility
The repository does not explicitly state a license in the README. However, it acknowledges dependencies on EasyR1, verl, vLLM, and Flow-GRPO, whose licenses should be reviewed for compatibility with commercial or closed-source use.
Limitations & Caveats
The framework is primarily demonstrated with VILA and Qwen series models, though it supports RL training on various modalities and models. Hardware configuration, particularly GPU memory, is critical for processing long video sequences effectively. The README does not specify a required Python version.
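To see why GPU memory matters for long videos, a back-of-envelope estimate can help. The sketch below uses the standard key-value cache size formula for a decoder-only transformer; every number in it (frame count, tokens per frame, layer count, KV heads, head dimension) is a hypothetical value chosen for illustration, not a figure taken from Long-RL or from any specific model.

```python
def visual_token_count(num_frames: int, tokens_per_frame: int) -> int:
    """Total visual tokens contributed by a video."""
    return num_frames * tokens_per_frame


def kv_cache_bytes(
    num_tokens: int,
    num_layers: int,
    num_kv_heads: int,
    head_dim: int,
    bytes_per_value: int = 2,  # 2 bytes for fp16/bf16
) -> int:
    """KV cache size for one sequence: keys and values at every layer."""
    return 2 * num_layers * num_kv_heads * head_dim * num_tokens * bytes_per_value


# Hypothetical setup: 512 frames at 256 tokens each, on a model with
# 28 layers, 4 KV heads, and a head dimension of 128.
tokens = visual_token_count(num_frames=512, tokens_per_frame=256)
cache_gib = kv_cache_bytes(tokens, num_layers=28, num_kv_heads=4, head_dim=128) / 2**30
print(f"{tokens} visual tokens, {cache_gib:.1f} GiB of KV cache per sequence")
```

Under these assumed values, a single sequence already needs several GiB for its KV cache before counting weights, activations, or multiple concurrent rollouts, so memory grows quickly with frame count.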