meta-prompting by meta-prompting

Elevating LLM reasoning through structural prompting

Created 2 years ago
264 stars

Top 96.8% on SourcePulse

View on GitHub
Project Summary

Summary

This repository provides the official implementation for "Meta Prompting for AI Systems," a novel framework designed to significantly elevate the reasoning capabilities of Large Language Models (LLMs). It addresses the limitations of traditional few-shot prompting by introducing example-agnostic structural templates that guide LLM problem-solving. The framework targets researchers and engineers seeking more robust, efficient, and theoretically grounded methods for interacting with LLMs, offering a path to state-of-the-art performance on complex benchmarks with substantial token efficiency gains.

How It Works

Meta Prompting (MP) employs high-level structural templates that specify how to think rather than showing examples of what to think, in contrast to few-shot methods that rely on learning patterns from concrete, content-rich examples. This teaches LLMs reusable, structured reasoning procedures that apply to entire categories of tasks. The framework is formalized in category theory: MP is modeled as a functor mapping a category of tasks to a category of prompts, which underwrites compositional problem-solving. Recursive Meta Prompting (RMP) goes further, automating prompt engineering by formalizing the LLM's self-improvement loop as a monad, which enables principled prompt generation and refinement.
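To make the contrast concrete, here is a minimal Python sketch of what an example-agnostic structural template looks like in practice. The template wording and the helper name are hypothetical illustrations, not drawn from the repository's /prompts files:

```python
# Illustration of the core idea: an example-agnostic structural template
# that specifies HOW to reason, with no worked examples attached.
# The template text and function name are illustrative only.

META_PROMPT = """\
You are solving a problem in the category: {task_type}.
Follow this structure exactly:
1. Restate the problem in precise terms.
2. List the known quantities and the goal.
3. Choose a solution strategy and justify it in one sentence.
4. Execute the strategy step by step.
5. Verify the result, then output the final line as: Answer: <value>.

Problem: {problem}
"""

def build_prompt(task_type: str, problem: str) -> str:
    """Instantiate the template for a concrete problem. The same
    scaffold serves every problem in the task category, unlike a
    few-shot prompt, which is tied to its worked examples."""
    return META_PROMPT.format(task_type=task_type, problem=problem)

print(build_prompt(
    task_type="grade-school arithmetic word problem",
    problem="A train travels 120 km in 1.5 hours. What is its average speed?",
))
```

Because the template carries structure rather than content, it can be reused verbatim across a whole benchmark, which is also where the token-efficiency gains come from.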

Quick Start & Requirements

The prompts used in the paper are available in the /prompts directory for direct use with an LLM. Interactive demos are accessible as custom GPTs, including CR Agent (v0.1, v0.2), MP-PT (online demo), and MP-ICPD (online demo). The foundational research paper is available at https://arxiv.org/abs/2311.11482. The README does not specify execution requirements, installation procedures, or dependencies beyond access to an LLM.
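Since no installation procedure is given, usage amounts to loading a prompt file and passing it to any chat-capable LLM. Below is a minimal sketch assuming the OpenAI Python client; the file name prompts/math_meta_prompt.txt is a placeholder, so check the /prompts directory for the actual names:

```python
# Sketch: drive an LLM with one of the repository's prompt templates.
# The file name is hypothetical; inspect /prompts for the real ones.
from pathlib import Path

from openai import OpenAI  # pip install openai

prompt_template = Path("prompts/math_meta_prompt.txt").read_text()

client = OpenAI()  # expects OPENAI_API_KEY in the environment
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "system", "content": prompt_template},
        {"role": "user", "content": "If 3x + 5 = 20, what is x?"},
    ],
)
print(response.choices[0].message.content)
```

Any other provider's chat API would work the same way: the meta-prompt goes in as the system or instruction message, and the task instance goes in as the user message.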

Highlighted Details

  • Achieves state-of-the-art results on challenging benchmarks: Qwen-72B (base model) with a single meta-prompt reaches 46.3% on MATH and 83.5% on GSM8K, surpassing many fine-tuned models and even initial GPT-4 releases.
  • Demonstrates extreme efficiency and a 100% success rate on the Game of 24 benchmark by generating a single Python program rather than searching interactively, significantly outperforming iterative methods such as Tree-of-Thought (ToT); a brute-force sketch of such a program follows this list.
  • Offers substantial token efficiency improvements over traditional few-shot prompting techniques.
  • Provides a rigorous mathematical foundation for prompt engineering using category theory (functors) and monads, ensuring theoretical guarantees for modularity and compositionality.
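On the Game of 24 point above: the model emits one program that solves the task outright instead of making repeated LLM calls per search step. The repository's generated program is not reproduced here; the following brute-force solver is only a sketch of the kind of program an LLM might produce:

```python
# Brute-force Game of 24: repeatedly combine two of the numbers with
# +, -, *, / until one remains; succeed if that number equals 24.
from itertools import permutations

EPS = 1e-6

def solve(nums):
    if len(nums) == 1:
        return abs(nums[0] - 24) < EPS
    for a, b, *rest in permutations(nums):
        candidates = [a + b, a - b, a * b]
        if abs(b) > EPS:  # avoid division by zero
            candidates.append(a / b)
        if any(solve([value, *rest]) for value in candidates):
            return True
    return False

print(solve([4, 7, 8, 8]))  # True: (7 - 8 / 8) * 4 = 24
print(solve([1, 1, 1, 1]))  # False: no expression reaches 24
```

Exhaustive enumeration in ordinary code is deterministic and costs no further LLM calls, which is why this style of solution is so much cheaper than iterative search methods like ToT.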

Maintenance & Community

The provided README does not include details regarding specific contributors, community support channels (such as Discord or Slack), or a public roadmap for future development.

Licensing & Compatibility

The repository's README omits any mention of a software license. The absence of explicit licensing information is a significant barrier to adoption, particularly for commercial use or integration into closed-source projects, and should be clarified with the maintainers before the code or prompts are reused.

Limitations & Caveats

The README focuses heavily on the project's performance achievements and does not enumerate limitations, known bugs, or its development stage (e.g., alpha/beta). The most critical caveat for potential adopters is the absence of clear licensing terms.

Health Check

  • Last Commit: 2 weeks ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 0
  • Issues (30d): 1
  • Star History: 13 stars in the last 30 days

Explore Similar Projects

Starred by Shizhe Diao (Author of LMFlow; Research Scientist at NVIDIA), Eric Zhu (Coauthor of AutoGen; Research Scientist at Microsoft Research), and 7 more.

reasoning-gym by open-thought (0.5%, 1k stars)
Procedural dataset generator for reasoning models
Created 11 months ago; updated 3 weeks ago