Thinkless: LLM intelligently decides when to think
Thinkless is a learnable framework that lets LLMs adaptively choose between concise and detailed reasoning based on task complexity and model confidence. It addresses the computational inefficiency of reasoning-intensive LLMs by cutting unnecessary long-form thinking, reducing it by 50%–90% on benchmarks. The project targets researchers and engineers seeking to optimize LLM reasoning efficiency.
How It Works
The core innovation is a reinforcement learning paradigm employing two control tokens to switch between short and long reasoning modes. It utilizes a novel Decoupled Group Relative Policy Optimization (DeGRPO) algorithm, which decomposes the learning objective into distinct control token and response accuracy losses. This approach stabilizes training and prevents common RL collapse issues, allowing for fine-grained control over reasoning mode selection.
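The decoupling described above can be sketched as two separately weighted policy-gradient terms, one for the mode-selection control token and one for the response tokens. This is a minimal illustrative sketch, not the paper's exact formulation; the function name, weighting scheme, and per-token averaging are assumptions.

```python
# Hedged sketch of a DeGRPO-style decoupled objective (illustrative only).
# control_logprob: log-prob of the chosen control token (e.g. short vs. long mode)
# response_logprobs: per-token log-probs of the generated response
# advantage: group-relative advantage for this rollout

def degrpo_loss(control_logprob, response_logprobs, advantage,
                control_weight=1.0, response_weight=1.0):
    """Split the objective into a control-token term (mode selection)
    and a response term (answer accuracy), so each can be scaled
    independently instead of letting long responses dominate."""
    control_term = -control_weight * advantage * control_logprob
    # Average over response tokens so response length does not swamp
    # the mode-selection gradient -- the imbalance decoupling addresses.
    response_term = -response_weight * advantage * (
        sum(response_logprobs) / len(response_logprobs))
    return control_term + response_term
```

Because the two terms carry separate weights, the control-token signal can be kept stable even when responses are long, which is how decoupling helps prevent the mode-collapse issues mentioned above.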
Quick Start & Requirements
Setup involves creating a Conda environment with Python 3.10. Installation for training requires specific versions of PyTorch (2.4.0), lm_eval (0.4.8), ray (2.45.0), and nvidia-cublas-cu12. The project provides a Python snippet for quick inference using Hugging Face Transformers. Links to the paper (ArXiv), SFT code, RL model (Thinkless-1.5B-RL-DeepScaleR), and datasets are available.
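The setup steps above can be collected into a short shell fragment. The version pins come from the requirements listed here; the environment name is arbitrary, and the exact install commands should be verified against the repository before use.

```shell
# Environment-setup sketch based on the stated requirements (verify
# against the repo's README before relying on it).
conda create -n thinkless python=3.10 -y
conda activate thinkless
pip install torch==2.4.0 lm_eval==0.4.8 ray==2.45.0 nvidia-cublas-cu12
```

Pinning torch and nvidia-cublas-cu12 together matters here, since mismatched CUDA library versions are one of the setup pitfalls noted under Limitations & Caveats.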
Highlighted Details
Empirically, Thinkless reduces long-chain thinking by 50%-90% on benchmarks like Minerva Algebra, MATH-500, and GSM8K. It offers pre-trained 1.5B parameter models for RL and warmup phases. Evaluation scripts for LM-Eval and custom answer extraction are included, facilitating performance assessment.
Maintenance & Community
The project acknowledges contributions from agentica-project/rllm (DeepScaleR) and Megatron-LM. It utilizes datasets like DeepScaleR and OpenThoughts2-1M. No explicit community channels (Discord/Slack) or roadmap details are provided in the README.
Licensing & Compatibility
The repository's license is not explicitly stated in the README, which requires clarification for adoption. Compatibility for commercial use or closed-source linking is therefore undetermined.
Limitations & Caveats
The TODO list indicates ongoing development for resume training, larger models (7B), and releasing warmup code. The current implementation may favor conciseness, potentially requiring hyperparameter tuning (e.g., correct_think_reward) for balanced performance. Specific CUDA versions and library dependencies can pose setup challenges.
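The conciseness bias noted above is commonly addressed through reward shaping. The sketch below is one plausible interpretation of a `correct_think_reward`-style knob, not the repository's actual reward function; the values, signature, and helper name are assumptions.

```python
# Hedged sketch of reward shaping for mode balance (illustrative only;
# not the repo's actual reward code).

def reward(is_correct, used_long_think, correct_think_reward=0.9):
    """Reward correct answers, slightly discounting correct answers
    produced in the long-thinking mode. Raising correct_think_reward
    toward 1.0 reduces the policy's bias toward short answers."""
    if not is_correct:
        return -1.0
    return correct_think_reward if used_long_think else 1.0
```

Tuning such a knob trades efficiency against accuracy: a lower value pushes the policy toward short answers, while a value near 1.0 keeps long-form thinking attractive on hard problems.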