chain-of-draft by sileix

Research paper code for efficient LLM reasoning

Created 10 months ago

338 stars

Top 81.6% on SourcePulse

Project Summary

Chain-of-Draft (CoD) is a novel prompting paradigm for Large Language Models (LLMs) that aims to improve reasoning efficiency and reduce token usage. It is designed for researchers and practitioners working with LLMs on complex reasoning tasks who seek to optimize performance and cost. CoD achieves this by mimicking human cognitive processes, generating concise intermediate thoughts instead of verbose step-by-step explanations.

How It Works

CoD prompts LLMs to produce minimalistic yet informative intermediate reasoning outputs, focusing on critical insights rather than exhaustive detail. This approach contrasts with traditional Chain-of-Thought (CoT) prompting, which emphasizes verbosity. By reducing the number of tokens generated for intermediate steps, CoD significantly lowers costs and latency while maintaining or improving accuracy on various reasoning tasks.

Quick Start & Requirements

Install/Run: Execute evaluation via python evaluate.py.
Prerequisites: Supports Claude, OpenAI, and OpenAI-compatible models. API keys can be loaded from environment variables.
Configuration: Prompts and few-shot examples are located in ./configs/{task}-{prompt}.yaml. Evaluation results are stored in ./results/.
Documentation: Paper

Highlighted Details

Matches or surpasses CoT accuracy on reasoning tasks.
Uses as little as 7.6% of the tokens compared to CoT.
Reduces cost and latency significantly.
Supports multiple tasks including gsm8k, date, sports, and coin_flip.

Maintenance & Community

The project is associated with the paper "Chain of Draft: Thinking Faster by Writing Less" by Xu, Silei et al. Further community or maintenance details are not provided in the README.

Licensing & Compatibility

The repository does not explicitly state a license. Compatibility for commercial use or closed-source linking is not specified.

Limitations & Caveats

The README does not specify any limitations, known bugs, or deprecation warnings. The project appears to be research-oriented, and its stability for production environments is not detailed.

Health Check

Last Commit

10 months ago

Responsiveness

Inactive

Pull Requests (30d)

Issues (30d)

Star History

4 stars in the last 30 days