ReasonFlux by Gen-Verse

LLM post-training algorithms for data selection, RL, and inference

created 5 months ago
465 stars

Top 66.2% on sourcepulse

Project Summary

ReasonFlux introduces a novel template-augmented reasoning paradigm for Large Language Models (LLMs), aiming to enhance performance on complex reasoning tasks. It targets researchers and developers seeking to improve LLM capabilities in areas like mathematics and general question answering, offering a method to scale reasoning abilities through structured thought processes.

How It Works

ReasonFlux employs a hierarchical approach, leveraging "thought templates" to guide LLM reasoning. This involves a "navigator" component that selects appropriate templates from a library based on the problem context, and an "inference" model that executes the reasoning steps guided by these templates. This method allows smaller models to achieve performance comparable to or exceeding larger, more general-purpose models on specific reasoning benchmarks.
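The navigator's template selection described above can be sketched as a nearest-neighbor lookup over template embeddings. The `ThoughtTemplate` structure, the hand-made 3-d vectors, and the function names below are illustrative assumptions for exposition, not the project's actual API; a real deployment would embed templates with a text embedding model and hand the chosen template's steps to the inference LLM.

```python
from dataclasses import dataclass
from math import sqrt

@dataclass
class ThoughtTemplate:
    """A reusable reasoning scaffold: a name, an embedding of its
    description, and the ordered steps a guided model should follow."""
    name: str
    embedding: list[float]
    steps: list[str]

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors (0.0 for zero-norm inputs)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = sqrt(sum(x * x for x in a))
    nb = sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def navigate(problem_embedding: list[float],
             library: list[ThoughtTemplate]) -> ThoughtTemplate:
    """Navigator step: pick the template closest to the problem embedding."""
    return max(library, key=lambda t: cosine(problem_embedding, t.embedding))

# Toy library with hypothetical 3-d "embeddings"; a real system would use a
# text embedding model to encode template descriptions.
library = [
    ThoughtTemplate("quadratic_formula", [1.0, 0.1, 0.0],
                    ["identify a, b, c", "compute discriminant", "apply formula"]),
    ThoughtTemplate("case_analysis", [0.0, 1.0, 0.2],
                    ["enumerate cases", "solve each case", "combine results"]),
]

chosen = navigate([0.9, 0.2, 0.0], library)
print(chosen.name)  # -> quadratic_formula (nearest to this problem vector)
```

The inference model would then be prompted step by step with `chosen.steps`, which is the "execution" half of the hierarchy.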

Quick Start & Requirements

  • Install: Clone the repository and set up a Conda environment (conda create -n ReasonFlux python==3.9, conda activate ReasonFlux, pip install -r requirements.txt).
  • Prerequisites: Python 3.9, llama-factory for training, lm-evaluation-harness for evaluation, and vllm for inference. Note: Avoid installing flash-attn if using jina-embedding-v3 due to potential conflicts.
  • Resources: Training requires significant GPU resources (e.g., 8x A100 GPUs for a 32B model). Inference with vllm is also resource-intensive.
  • Links: Model Zoo, ReasonFlux-F1 README, LLaMA-Factory.

Highlighted Details

  • ReasonFlux-F1-32B outperforms models like o1-mini and DeepSeek-R1-Distill-32B on MATH500 (96.0 vs 90.0/94.3) and AIME2024 (76.7 vs 56.7/72.6).
  • Supports training and inference for multiple model sizes (7B, 14B, 32B).
  • Utilizes a template library for structured reasoning, with an embedding-based retrieval mechanism.
  • Built upon preliminary works like "Buffer of Thoughts" and "SuperCorrect".

Maintenance & Community

  • The project is associated with the paper "ReasonFlux: Hierarchical LLM Reasoning via Scaling Thought Templates" (arXiv:2502.06772).
  • Recent updates include the release of ReasonFlux-F1 models and associated training/inference code.

Licensing & Compatibility

  • The repository appears to be released under a permissive license, but specific terms are not explicitly detailed in the README. Model weights are available on HuggingFace.

Limitations & Caveats

  • The inference code for ReasonFlux-Zero requires specific paths for navigator, template matcher, and inference models, which need to be provided by the user.
  • Potential dependency conflicts exist, particularly between flash-attn and jina-embedding-v3.
  • Evaluation requires specific setup of the lm-evaluation-harness framework.
Health Check

  • Last commit: 2 weeks ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 0
  • Issues (30d): 3
  • Star History: 91 stars in the last 90 days
