eric-ai-lab: Enhancing LLM reasoning via continuous concept spaces
Top 96.4% on SourcePulse
Summary
This project provides the official implementation of "Soft Thinking," a method that enhances Large Language Model (LLM) reasoning by operating in a continuous concept space rather than over discrete tokens. It targets researchers and engineers who want to improve LLM reasoning performance beyond standard decoding.
How It Works
Rather than committing to a single sampled token at each reasoning step, Soft Thinking has the LLM operate in a continuous concept space, enabling more nuanced intermediate representations than discrete token generation allows. The implementation also supports optional Dirichlet and Gumbel-Softmax noise injection during sampling, detailed in a related study, to further explore and refine these conceptual representations.
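The core idea can be sketched in a few lines. The following toy numpy example (not the repository's implementation; the embedding table, sizes, and function names are illustrative) contrasts standard hard decoding, which feeds back one token's embedding, with a soft step that feeds back a probability-weighted mixture of all token embeddings, optionally perturbed with Gumbel noise before the softmax:

```python
import numpy as np

rng = np.random.default_rng(0)

vocab, dim = 8, 4
E = rng.normal(size=(vocab, dim))  # toy token embedding table

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def gumbel_softmax(logits, tau=1.0):
    # Perturb logits with Gumbel noise, then temperature-scaled softmax
    # (Gumbel-Softmax is one of the noise options mentioned above).
    g = -np.log(-np.log(rng.uniform(size=logits.shape)))
    return softmax((logits + g) / tau)

def hard_step(logits):
    # Standard decoding: commit to one token, feed back its embedding.
    return E[int(np.argmax(logits))]

def soft_step(logits, tau=1.0, noisy=False):
    # Soft step: feed back the mixture of ALL embeddings weighted by the
    # output distribution -- a continuous "concept" vector, not a token.
    p = gumbel_softmax(logits, tau) if noisy else softmax(logits)
    return p @ E

logits = rng.normal(size=vocab)
print(hard_step(logits).shape, soft_step(logits).shape)  # both (4,)
```

The soft vector generally lies between token embeddings, so downstream layers see a blend of candidate continuations instead of a single committed choice.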
Quick Start & Requirements
Setup involves cloning the repository, installing the core Python packages (torch, transformers, accelerate, flash_attn), and installing a tailored version of SGLang (sglang_soft_thinking_pkg). Docker installation is recommended for environment consistency; note that flash_attn installation may take up to 20 minutes. Key entry points include the Docker setup script (docker.sh), environment configuration (configure.sh), a baseline script (scripts/baseline/qwq32b.sh), and the inference script (run_sglang_softthinking.py).
Highlighted Details
Uses an external judge model (gpt-4.1-2025-04-14) for result validation. Evaluation covers benchmarks such as aime2024. Tunable sampling parameters include max_topk, min_p, and early stopping thresholds.
Maintenance & Community
The provided README does not detail community channels (e.g., Discord, Slack), roadmap, or notable contributors.
Licensing & Compatibility
The project features a dual licensing structure: original code is under the permissive MIT License, while the modified SGLang package (sglang_soft_thinking_pkg) is licensed under Apache License 2.0. Both licenses generally permit commercial use, with Apache 2.0 requiring standard attribution and notice.
Limitations & Caveats
Reproducibility across different hardware is challenging due to potential precision differences; Docker is strongly recommended. A multiprocessing bug affects coding benchmarks, necessitating a specific execution order using the --reeval flag. flash_attn installation can be lengthy.
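To illustrate why precision differences across hardware can hurt reproducibility, here is a small numpy example (not from the repository) in which two near-tied logits are distinguishable in float64 but collapse to the same value in float32, flipping the argmax and hence the decoded token:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

# Two near-tied logits whose gap (1e-8) is below float32 resolution near 2.0
logits64 = np.array([2.00000001, 2.00000002, -1.0], dtype=np.float64)
logits32 = logits64.astype(np.float32)

print(np.argmax(softmax(logits64)))  # 1: float64 preserves the tiny gap
print(np.argmax(softmax(logits32)))  # 0: gap rounds away; tie broken by index
```

On real hardware the same effect arises from different kernels, reduction orders, or bfloat16/float16 accumulation rather than an explicit cast, which is why pinning the environment with Docker only partially mitigates it.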
Last updated: 2 months ago (Inactive)