tree-of-thought-llm by princeton-nlp

Research paper implementation for Tree of Thoughts (ToT) prompting

created 2 years ago
5,468 stars

Top 9.4% on sourcepulse

Project Summary

This repository provides the official implementation for the "Tree of Thoughts" (ToT) framework, enabling large language models (LLMs) to engage in deliberate problem-solving. It is designed for researchers and practitioners looking to enhance LLM reasoning capabilities beyond standard prompting methods, offering a structured approach to complex tasks.

How It Works

ToT introduces a framework that allows LLMs to explore multiple reasoning paths, akin to a search tree. It decomposes problems into intermediate thoughts, evaluates their potential, and uses search algorithms (like Breadth-First Search or Depth-First Search) to navigate these thought trees. This deliberate exploration and evaluation process aims to improve performance on tasks requiring strategic planning and complex reasoning.
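The search loop described above can be sketched as a pruned breadth-first search. In this minimal sketch the propose and value functions are toy stand-ins for the framework's LLM prompt calls (they are not the repository's actual API); in the real implementation each would be a model query.

```python
# Toy sketch of ToT-style breadth-first search over "thoughts".
# propose/value are deterministic stubs standing in for LLM calls.

def propose(state):
    """Generate candidate next thoughts (stub for an LLM 'propose' prompt)."""
    return [state + [c] for c in ("a", "b", "c")]

def value(state):
    """Score a partial thought sequence (stub for an LLM 'value' prompt)."""
    return sum(1 for t in state if t == "a") / max(len(state), 1)

def tot_bfs(root, steps=3, beam=2):
    frontier = [root]
    for _ in range(steps):
        # Expand every frontier state, then keep only the top-`beam`
        # candidates by evaluated score (pruned BFS / beam search).
        candidates = [s for state in frontier for s in propose(state)]
        candidates.sort(key=value, reverse=True)
        frontier = candidates[:beam]
    return frontier[0]

print(tot_bfs([]))  # → ['a', 'a', 'a']
```

The `beam` parameter plays the role of the repository's breadth limit: raising it explores more alternatives per step at the cost of more evaluation calls.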

Quick Start & Requirements

  • Install via pip: pip install tree-of-thought-llm
  • Alternatively, install from source: git clone https://github.com/princeton-nlp/tree-of-thought-llm && cd tree-of-thought-llm && pip install -r requirements.txt && pip install -e .
  • Requires an OpenAI API key set as the OPENAI_API_KEY environment variable.
  • Official documentation and examples are available in the repository.

Highlighted Details

  • Implements BFS and DFS search algorithms for exploring thought trees.
  • Supports various generation and evaluation methods (propose/sample, value/vote).
  • Includes official prompts and model outputs for tasks like Game of 24, creative writing, and crosswords.
  • Provides scripts for reproducing paper experiments.
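The value/vote distinction above can be illustrated with a small sketch: "value" scores each candidate thought independently, while "vote" asks the evaluator to compare all candidates jointly and tallies its picks. The evaluator stubs below are hypothetical stand-ins, not the repository's prompts.

```python
# "value" vs "vote" evaluation, with deterministic stubs in place of
# the LLM evaluation prompts the real framework uses.
from collections import Counter

candidates = ["4 + 6 = 10", "13 - 9 = 4", "10 * 2 = 20"]

def value_eval(thought):
    # Stub: pretend the model rates thoughts ending in 4 highly.
    return 1.0 if "= 4" in thought else 0.1

def vote_eval(thoughts, n_votes=5):
    # Stub: each "vote" would ask the model to pick the best candidate;
    # here every vote deterministically picks index 1.
    votes = Counter(1 for _ in range(n_votes))
    return votes.most_common(1)[0][0]

best_by_value = max(candidates, key=value_eval)      # independent scoring
best_by_vote = candidates[vote_eval(candidates)]     # joint comparison
print(best_by_value, "|", best_by_vote)
```

In practice, value-style evaluation costs one call per candidate, while vote-style evaluation sends all candidates in one prompt and repeats it `n_votes` times to average out sampling noise.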

Maintenance & Community

The project is associated with Princeton NLP and the authors of the NeurIPS 2023 paper. Contact is available via email (shunyuyao.cs@gmail.com) or GitHub issues.

Licensing & Compatibility

The repository does not explicitly state a license in the provided README. Users should verify licensing terms for commercial or closed-source use.

Limitations & Caveats

The README notes that a rerun of the Game of 24 experiment achieved a 69% success rate, down from the 74% reported in the paper, a gap attributed to randomness in GPT decoding. The original experiment was run in a notebook, and the authors plan to aggregate multiple runs to account for sampling variability.

Health Check

  • Last commit: 6 months ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 211 stars in the last 90 days
