zero_shot_cot by kojima-takeshi188

Reasoning framework for LLMs, based on a NeurIPS 2022 paper

created 3 years ago
424 stars

Top 70.6% on sourcepulse

View on GitHub
Project Summary

This repository provides the official implementation for the NeurIPS 2022 paper "Large Language Models are Zero-Shot Reasoners." It enables users to explore and replicate experiments on using large language models for zero-shot reasoning tasks, particularly chain-of-thought prompting. The target audience includes AI researchers and practitioners interested in improving LLM reasoning capabilities without task-specific fine-tuning.

How It Works

The project elicits reasoning from large language models (LLMs) by appending the trigger phrase "Let's think step by step" to the prompt. This zero-shot chain-of-thought (CoT) technique first prompts the model to generate intermediate reasoning steps, then uses a second prompt to extract the final answer from that reasoning, significantly improving performance on complex arithmetic and symbolic reasoning tasks without any task-specific examples. The implementation supports four prompting methods: zero-shot, zero-shot-CoT, few-shot, and few-shot-CoT.
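
As a rough sketch of the idea (not the repository's actual code), the two prompting stages can be written as follows; the complete function is a hypothetical stand-in for any LLM completion call:

    # Minimal illustrative sketch of two-stage zero-shot-CoT prompting; `complete`
    # is a hypothetical stand-in for any LLM completion call, not the repo's code.
    from typing import Callable

    def zero_shot_cot(question: str, complete: Callable[[str], str]) -> str:
        # Stage 1: reasoning extraction -- the trigger phrase elicits step-by-step reasoning.
        reasoning_prompt = f"Q: {question}\nA: Let's think step by step."
        reasoning = complete(reasoning_prompt)
        # Stage 2: answer extraction -- feed the reasoning back and ask for the final answer.
        answer_prompt = f"{reasoning_prompt} {reasoning}\nTherefore, the answer (arabic numerals) is"
        return complete(answer_prompt).strip()

    # Dummy completion function so the sketch runs without an API key.
    dummy = lambda p: "7" if "Therefore" in p else "3 + 4 = 7, so there are 7 apples."
    print(zero_shot_cot("I have 3 apples and buy 4 more. How many do I have?", dummy))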

Quick Start & Requirements

  • Installation:
    pip install torch==1.8.2+cu111 torchtext==0.9.2 -f https://download.pytorch.org/whl/lts/1.8/torch_lts.html
    pip install -r requirements.txt
    export OPENAI_API_KEY='YOUR_OPENAI_API_KEY'
    
  • Prerequisites: Python >= 3.8, CUDA 11.1 (for PyTorch), OpenAI API key.
  • Usage:
    python main.py --method=zero_shot_cot --model=gpt3-xl --dataset=multiarith --limit_dataset_size=10 --api_time_interval=1.0
    
  • Resources: Requires access to OpenAI's InstructGPT models via the paid API. Because usage is billed per token, capping the number of evaluated examples with limit_dataset_size is recommended to control costs (a direct-API prompt sketch follows this list).
  • Docs: NeurIPS 2022 Paper, arXiv
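
To try the prompt itself without the repository's pinned PyTorch/OpenAI setup, a query can be sent through the current openai Python SDK. This is only an illustrative sketch: the chat-completions interface and the model name below are assumptions and differ from the legacy InstructGPT completion API the repo targets.

    # Illustrative only: one zero-shot-CoT query via the current openai Python SDK.
    # The repo itself calls the legacy completion API; the model name is an assumption.
    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    question = ("A juggler has 16 balls. Half are golf balls, and half of the "
                "golf balls are blue. How many blue golf balls are there?")
    prompt = f"Q: {question}\nA: Let's think step by step."

    response = client.chat.completions.create(
        model="gpt-4o-mini",  # assumption; any chat-capable model works
        messages=[{"role": "user", "content": prompt}],
    )
    print(response.choices[0].message.content)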

Highlighted Details

  • Demonstrates that LLMs can perform complex reasoning tasks in a zero-shot manner via chain-of-thought prompting.
  • Supports multiple prompting strategies: zero-shot, zero-shot-CoT, few-shot, and few-shot-CoT (prompt templates are sketched after this list).
  • Compatible with datasets like MultiArith and GSM8K.
  • Utilizes OpenAI's InstructGPT models (e.g., gpt3-xl).
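
As a rough illustration of how these strategies differ at the prompt level, the templates below are simplified assumptions rather than the repository's exact wording or exemplars:

    # Simplified prompt templates for the four strategies; wording and the
    # worked exemplar are illustrative assumptions, not the repo's templates.
    question = "Roger has 5 balls and buys 2 cans of 3 balls each. How many balls does he have?"
    exemplar = "Q: There are 4 apples and Mary eats 1. How many remain?"

    prompts = {
        # Zero-shot: ask for the answer directly.
        "zero_shot": f"Q: {question}\nA: The answer is",
        # Zero-shot-CoT: add the reasoning trigger before answering.
        "zero_shot_cot": f"Q: {question}\nA: Let's think step by step.",
        # Few-shot: prepend a worked example with the answer only.
        "few_shot": f"{exemplar}\nA: The answer is 3.\n\nQ: {question}\nA:",
        # Few-shot-CoT: prepend a worked example that spells out the reasoning.
        "few_shot_cot": (f"{exemplar}\nA: Mary eats 1 of 4 apples, so 4 - 1 = 3 remain. "
                         f"The answer is 3.\n\nQ: {question}\nA:"),
    }

    for name, prompt in prompts.items():
        print(f"--- {name} ---\n{prompt}\n")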

Maintenance & Community

The project is associated with the authors of the NeurIPS 2022 paper. No specific community channels or active maintenance indicators are present in the README.

Licensing & Compatibility

The repository does not explicitly state a license. The code is provided for research purposes, and usage of OpenAI models is subject to their terms of service. Commercial use may be restricted by OpenAI's API policies.

Limitations & Caveats

The implementation depends entirely on the OpenAI API, which requires an API key and incurs per-token costs. The api_time_interval parameter indicates that client-side throttling is needed to stay within API rate limits, as sketched below. The project targets specific datasets and InstructGPT-era models, so adapting it to other LLMs or datasets may require modifications.
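
For illustration, the kind of client-side throttling that api_time_interval suggests could look like the following sketch, where call_api is a hypothetical stand-in for the actual request function:

    # Minimal throttling sketch: wait a fixed interval between successive API calls.
    # `call_api` is a hypothetical stand-in for the actual request function.
    import time
    from typing import Callable, Iterable, List

    def run_with_throttle(prompts: Iterable[str],
                          call_api: Callable[[str], str],
                          api_time_interval: float = 1.0) -> List[str]:
        answers = []
        for prompt in prompts:
            time.sleep(api_time_interval)  # space out requests to respect rate limits
            answers.append(call_api(prompt))
        return answers

    # Dummy API so the sketch runs offline.
    print(run_with_throttle(["Q1", "Q2"], lambda p: f"answer to {p}", api_time_interval=0.1))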

Health Check

  • Last commit: 1 year ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 11 stars in the last 90 days
