atom by qixucen

Reasoning framework for Markov LLM test-time scaling (research paper)

created 5 months ago
579 stars

Top 56.7% on sourcepulse

View on GitHub
Project Summary

This repository provides a lightweight, standalone implementation of "Atom of Thoughts" (AoT), a novel reasoning framework for Large Language Models (LLMs). AoT enhances LLM performance on diverse reasoning tasks by decomposing problems into atomic questions, forming a Markov process. It's designed for researchers and practitioners seeking to improve LLM reasoning efficiency and accuracy, with potential for integration into existing test-time scaling methods.

How It Works

AoT represents a solution as a composition of atomic questions, turning the reasoning process into a Markov process. Each state transition first decomposes the current question into a directed acyclic graph (DAG) of subquestions, then contracts those subquestions into a new, self-contained atomic state. Because each state carries no history, computation is spent on the current reasoning step rather than on re-processing accumulated context, which also lets AoT act as a plugin for other test-time scaling methods.
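To make the state transition concrete, here is a minimal Python sketch of the decompose-contract loop, assuming a user-supplied llm callable; the names AtomicState, decompose, contract, and aot_solve, and the adjacency-list DAG representation, are illustrative assumptions rather than the repository's actual API.

    # Hypothetical sketch of the AoT Markov loop; not the repository's actual code.
    from dataclasses import dataclass, field

    @dataclass
    class AtomicState:
        """A self-contained question; the Markov state carries no history."""
        question: str
        known_facts: list[str] = field(default_factory=list)

    def decompose(state: AtomicState, llm) -> dict[str, list[str]]:
        # The llm wrapper is assumed to parse model output into an adjacency
        # dict mapping each subquestion to the subquestions it depends on (a DAG).
        return llm(f"Decompose into a dependency DAG of subquestions: {state.question}")

    def contract(state: AtomicState, dag: dict[str, list[str]], llm) -> AtomicState:
        # Answer the independent subquestions, then fold the answers and the
        # remaining dependent subquestions into a new, simpler atomic state.
        independent = [q for q, deps in dag.items() if not deps]
        facts = [llm(f"Answer briefly: {q}") for q in independent]
        remaining = [q for q, deps in dag.items() if deps]
        new_question = " ".join(remaining) if remaining else state.question
        return AtomicState(new_question, state.known_facts + facts)

    def aot_solve(question: str, llm, max_steps: int = 5) -> str:
        # Each transition depends only on the current state, so discarded
        # history never has to be re-processed.
        state = AtomicState(question)
        for _ in range(max_steps):
            dag = decompose(state, llm)
            if len(dag) <= 1:          # question is already atomic
                break
            state = contract(state, dag, llm)
        facts = " ".join(state.known_facts)
        return llm(f"Given: {facts}\nAnswer: {state.question}")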

Quick Start & Requirements

  • Install/Run: python main.py --dataset <dataset_name> --start <start_idx> --end <end_idx> --model <model_name> [--mode atom|plugin]
  • Prerequisites: OpenAI API key and endpoint configured in apikey.py (a hypothetical sketch follows this list).
  • Datasets: Supports math, gsm8k, bbh, mmlu, hotpotqa, longbench.
  • Models: Supports models like gpt-4o-mini.
  • Resources: Requires a Python environment and API access; setup is minimal beyond API key configuration.
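A minimal apikey.py might look like the sketch below; the variable names (api_key, base_url, model) are assumptions for illustration, so check the repository for the exact names it expects.

    # apikey.py -- hypothetical contents; the actual variable names may differ.
    api_key = "sk-..."                        # OpenAI-compatible API key
    base_url = "https://api.openai.com/v1"    # API endpoint
    model = "gpt-4o-mini"                     # model name mentioned in the summary above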

Highlighted Details

  • General reasoning capability across math, multi-choice, and multi-hop QA with a single codebase.
  • Plugin mode allows AoT to generate contracted questions for use by other reasoning frameworks (see the example invocations after this list).
  • Focuses computational resources on effective reasoning, not historical information.
  • Claims significant enhancement of LLM performance while reducing computational waste.
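For illustration, here are two invocations assembled from the flags shown under Quick Start; the index range and dataset choices are arbitrary.

    # Standalone atom mode on GSM8K
    python main.py --dataset gsm8k --start 0 --end 100 --model gpt-4o-mini --mode atom
    # Plugin mode: produce contracted questions for use by another framework
    python main.py --dataset hotpotqa --start 0 --end 100 --model gpt-4o-mini --mode plugin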

Maintenance & Community

The project is associated with the MetaGPT framework (ICLR 2024 Oral) and mentions upcoming integration with AFlow (ICLR 2025 Oral). The README notes significant community interest with over 380k views on a related post.

Licensing & Compatibility

The repository does not explicitly state a license in the provided README. Users should verify licensing for commercial or closed-source use.

Limitations & Caveats

The implementation is described as "lightweight" and "standalone," so it may not cover every feature of a more mature framework. The primary dependency is OpenAI API access, which incurs costs and is subject to rate limits. The project accompanies a paper (arXiv:2502.12018) and may still be under active development.

Health Check

  • Last commit: 1 month ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 0
  • Issues (30d): 0

Star History

  • 26 stars in the last 90 days
