o1  by win4r

LLM reasoning chains via prompting strategies

Created 1 year ago
294 stars

Top 89.9% on SourcePulse

GitHubView on GitHub
Project Summary

This project provides a framework for enhancing Large Language Model (LLM) reasoning capabilities through "o1-like" dynamic reasoning chains, inspired by OpenAI's o1. It targets developers and researchers seeking to improve LLM performance on logical problems without model retraining, offering a visualized, step-by-step thinking process.

How It Works

The core approach involves dynamic Chain-of-Thought prompting, where the LLM breaks down problems into sequential reasoning steps. At each stage, the LLM can choose to continue reasoning or provide a final answer. The system prompt guides the LLM to explore alternative solutions, question previous steps, and acknowledge its limitations, thereby improving accuracy on complex logic tasks.

Quick Start & Requirements

  • Groq: Requires Groq API key and requirements.txt.
  • OpenAI: Requires OpenAI API key and requirements.txt. Run with streamlit run app_openai.py.
  • Ollama: Requires Ollama setup with specified models and .env file for OLLAMA_URL and OLLAMA_MODEL. Run with streamlit run app_ollama.py.
  • Dependencies: Python 3, venv, pip, requirements.txt, Streamlit.

Highlighted Details

  • Achieves ~70% accuracy on the "Strawberry problem" using prompting alone, significantly improving upon base model performance.
  • Supports multiple LLM backends: Llama-3.1 70b on Groq, OpenAI GPT-4o, and local Ollama models.
  • Visualizes each reasoning step, allowing users to follow the LLM's thought process.

Maintenance & Community

Originally developed by Benjamin Klieger, extended by the open-source community. Links to Bilibili and YouTube channels are provided for support.

Licensing & Compatibility

The README does not explicitly state the license. Compatibility for commercial use or closed-source linking is not specified.

Limitations & Caveats

This is an early prototype and accuracy has not been formally evaluated, though initial testing shows significant improvement over out-of-the-box LLMs. The project is experimental and aims to inspire new prompting strategies rather than replicate OpenAI's o1.

Health Check
Last Commit

1 year ago

Responsiveness

Inactive

Pull Requests (30d)
0
Issues (30d)
0
Star History
1 stars in the last 30 days

Explore Similar Projects

Feedback? Help us improve.