LLM reasoning via rule-based reinforcement learning, research paper
Top 19.7% on sourcepulse
This repository provides Logic-RL, a framework for enhancing Large Language Model (LLM) reasoning capabilities on logic puzzles through rule-based reinforcement learning. It targets researchers and developers who want to improve LLM performance on complex, rule-bound tasks, and it reports a significant boost in accuracy compared to standard LLMs.
How It Works
Logic-RL applies reinforcement learning to LLM post-training using a rule-based reward system in place of a learned reward model. Each generated solution is scored against explicit rules, so the model is rewarded for adhering to the logical rules and constraints of the puzzle, which steers its learning trajectory toward more accurate solutions. The framework leverages TinyZero and veRL for efficient training and deployment.
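As a rough sketch of what a rule-based reward can look like, the Python snippet below checks a completion for an assumed <think>/<answer> tag format and compares the extracted answer against a known puzzle solution; the function name, tag format, and reward values are illustrative assumptions rather than the repository's exact implementation.

import re

# Assumed output format: <think> reasoning </think> <answer> final answer </answer>
ANSWER_PATTERN = re.compile(r"<think>.*?</think>\s*<answer>(.*?)</answer>", re.DOTALL)

def rule_based_reward(completion: str, solution: dict) -> float:
    """Illustrative rule-based reward: format check plus exact answer verification."""
    match = ANSWER_PATTERN.search(completion)
    if match is None:
        return -1.0  # no parsable answer block: penalize malformed output
    answer = match.group(1).lower()
    # Reward only if every character's role (e.g. knight/knave) is stated correctly.
    correct = all(f"{name} is a {role}" in answer for name, role in solution.items())
    return 1.0 if correct else -0.5

# Example check against a Knights-and-Knaves style ground truth.
truth = {"alice": "knight", "bob": "knave"}
output = "<think>Alice's statement must be true...</think><answer>Alice is a knight, Bob is a knave.</answer>"
print(rule_based_reward(output, truth))  # 1.0

A reward of this shape lets correct, well-formatted reasoning traces be reinforced directly, without training a separate reward model.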
Quick Start & Requirements
# Create and activate a fresh conda environment
conda create -n logic python=3.9
conda activate logic
# PyTorch 2.4.0 built against CUDA 12.1
pip install torch==2.4.0 --index-url https://download.pytorch.org/whl/cu121
# vLLM for rollouts, Ray for distributed execution
pip3 install vllm==0.6.3 ray
pip3 install flash-attn --no-build-isolation
# Install this repository in editable mode, then launch GRPO training
pip install -e .
bash main_grpo.sh
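Before launching the full run, a short sanity check like the one below (an illustrative snippet, not part of the repository) can confirm that the pinned torch, vLLM, and flash-attn builds import correctly and that the GPUs are visible:

# Illustrative environment check for the setup above (not part of the repository).
import torch
import vllm
import flash_attn

print("torch:", torch.__version__, "| CUDA available:", torch.cuda.is_available())
print("visible GPUs:", torch.cuda.device_count())
print("vllm:", vllm.__version__, "| flash-attn:", flash_attn.__version__)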
Training is noted to require 4x A100 80GB GPUs.
Highlighted Details
Maintenance & Community
The project is associated with authors from institutions such as Tsinghua University. Community engagement channels beyond the repository itself are not explicitly listed in the README.
Licensing & Compatibility
The repository does not explicitly state a license. Its dependency stack (PyTorch, vLLM, Ray, flash-attn) is standard for CUDA-based ML setups, so it should be compatible with common ML development environments.
Limitations & Caveats
The training process is resource-intensive, requiring 4x A100 80GB GPUs. The project is recent, with primary results published in March 2025, so the code and results may still see ongoing development and changes.