openr by openreasoner

Open-source framework for advanced LLM reasoning

created 9 months ago
1,804 stars

Top 24.4% on sourcepulse

View on GitHub
Project Summary

OpenR is an open-source framework designed to enhance the reasoning capabilities of Large Language Models (LLMs). It targets researchers and developers working on complex problem-solving tasks, particularly in areas like mathematical reasoning, by providing tools for data generation, policy training, and advanced search strategies. The framework aims to improve LLM performance through process supervision and reinforcement learning techniques.

How It Works

OpenR employs a multi-faceted approach to LLM reasoning. It supports process-supervision data generation using methods like OmegaPRM, enabling models to learn from intermediate reasoning steps rather than just final outcomes. For training, it integrates online policy training with algorithms such as APPO, GRPO, and TPPO, alongside generative and discriminative PRM training. The framework also offers diverse search strategies, including Greedy Search, Best-of-N, Beam Search, MCTS, and rStar, allowing for flexible exploration of reasoning paths.
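As an illustration of the simplest of these strategies, Best-of-N samples several candidate answers from the policy model and keeps the one a reward model scores highest. The `generate` and `score` callables below are hypothetical stand-ins, not OpenR's actual API; this is a minimal sketch of the idea, with toy functions standing in for real models.

```python
def best_of_n(question, generate, score, n=8):
    """Sample n candidate answers and return the one the reward model prefers."""
    candidates = [generate(question) for _ in range(n)]
    # Score each candidate with the (process or outcome) reward model.
    scored = [(score(question, c), c) for c in candidates]
    # Return the highest-scoring candidate answer.
    return max(scored, key=lambda pair: pair[0])[1]

# Toy stand-ins for demonstration only: a "policy" that cycles through fixed
# answers and a "reward model" that happens to prefer longer answers.
import itertools

_answers = itertools.cycle(["4", "2 + 2 = 4", "the answer is 4 because 2 + 2 = 4"])

def toy_generate(question):
    return next(_answers)

def toy_score(question, answer):
    return len(answer)
```

In practice `score` would be a trained PRM or outcome reward model, and `generate` a sampled decode from the LM; the selection logic itself stays this simple.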

Quick Start & Requirements

  • Installation: Use Conda to create an environment (conda create -n open_reasoner python=3.10) and activate it (conda activate open_reasoner), then install dependencies (pip install -r requirements.txt, pip3 install "fschat[model_worker,webui]", pip install -U pydantic). latex2sympy requires an additional setup step.
  • Prerequisites: Requires downloading specific base models (e.g., Qwen2.5-Math, Mistral variants) from Hugging Face. Configuration involves setting environment variables for model paths and worker counts.
  • Resources: Running inference requires starting LM and RM services. Training involves modifying script parameters for dataset and model paths.
  • Links: Paper, Docs, Demo
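The installation steps above can be consolidated into one session (the commands follow the project README as summarized here; treat this as a sketch of the setup, not a verified recipe):

```shell
# Create and activate the Conda environment (Python 3.10, per the README).
conda create -n open_reasoner python=3.10
conda activate open_reasoner

# Install the project's Python dependencies.
pip install -r requirements.txt
pip3 install "fschat[model_worker,webui]"
pip install -U pydantic

# latex2sympy needs its own additional setup step; see the repository docs.
```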

Highlighted Details

  • Supports MCTS reasoning and rStar for improved LLM problem-solving.
  • Offers online policy training with APPO, GRPO, and TPPO.
  • Features process-supervision data generation (OmegaPRM) and generative RM training.
  • Includes benchmark results showing competitive performance on reasoning tasks.
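Among the search strategies listed above, beam search over reasoning steps pairs naturally with a process reward model: partial chains are extended one step at a time, and only the top-scoring chains survive each round. The `expand` and `score_step` callables below are hypothetical stand-ins for a step proposer and a PRM, not OpenR's actual interfaces; this is a minimal sketch of the control flow.

```python
def beam_search(question, expand, score_step, beam_width=4, max_steps=5):
    """Grow reasoning chains step by step, keeping the top partial chains.

    expand(question, steps) proposes candidate next steps for a partial chain;
    score_step(question, steps) rates a partial chain (e.g. with a PRM).
    """
    beams = [([], 0.0)]  # each beam: (steps so far, cumulative score)
    for _ in range(max_steps):
        candidates = []
        for steps, total in beams:
            for nxt in expand(question, steps):
                new_steps = steps + [nxt]
                candidates.append((new_steps, total + score_step(question, new_steps)))
        # Keep only the beam_width highest-scoring partial chains.
        beams = sorted(candidates, key=lambda b: b[1], reverse=True)[:beam_width]
    return beams[0][0]  # steps of the best surviving chain
```

MCTS and rStar replace this fixed-width frontier with tree statistics and mutual-verification rollouts, but the same step-level scoring signal drives the exploration.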

Maintenance & Community

The project is maintained by the Openreasoner Team. Contributions are welcome, and contributor guidance is provided. Community engagement is encouraged via WeChat.

Licensing & Compatibility

OpenR is released under the MIT License, which permits commercial use and integration with closed-source projects.

Limitations & Caveats

The README indicates that test-time computation and scaling laws are "TBA" and lists multi-modal reasoning and reasoning in code generation as future TODOs, suggesting these areas may be less mature or incomplete.

Health Check

  • Last commit: 6 months ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 49 stars in the last 90 days
