gepa by gepa-ai

AI-driven evolution for system text components

Created 1 month ago
671 stars

Top 50.2% on SourcePulse

Project Summary

GEPA is a framework for optimizing the text components of a system, such as AI prompts, code, or specifications, using an evolutionary search driven by LLM reflection. It targets developers and researchers who want to improve system performance by iteratively refining those components against defined evaluation metrics, and it aims to produce robust, high-performing variants within a limited evaluation budget.

How It Works

GEPA employs a "Reflective Text Evolution" strategy. It uses Large Language Models (LLMs) to analyze feedback from system execution and evaluation traces, reflecting on performance to generate targeted mutations for text components. Candidates are iteratively mutated, evaluated, and selected using a Pareto-aware approach, allowing for the co-evolution of multiple components within modular systems to achieve domain-specific improvements.
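The mutate, evaluate, and select loop described above can be illustrated with a minimal sketch. This is not GEPA's implementation: the LLM reflection step is replaced by a deterministic stub (`toy_reflect`), the evaluator is a keyword check (`toy_evaluate`), and every name in the block is hypothetical. It only shows one plausible reading of "Pareto-aware" selection, where parents are drawn from candidates that are not dominated on per-example scores.

```python
import random
from typing import Callable, Dict, List, Tuple

Scores = List[float]  # one score per evaluation example


def dominates(a: Scores, b: Scores) -> bool:
    """True if a is at least as good as b everywhere and strictly better somewhere."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(pool: Dict[str, Scores]) -> List[str]:
    """Candidates whose per-example scores are not dominated by any other candidate."""
    return [
        cand for cand, s in pool.items()
        if not any(dominates(t, s) for other, t in pool.items() if other != cand)
    ]


def evolve(
    seed: str,
    evaluate: Callable[[str], Scores],
    reflect: Callable[[str, Scores], str],
    iterations: int,
    rng: random.Random,
) -> Tuple[str, Dict[str, Scores]]:
    """Repeatedly pick a non-dominated parent, mutate it, and score the child."""
    pool = {seed: evaluate(seed)}
    for _ in range(iterations):
        parent = rng.choice(pareto_front(pool))
        child = reflect(parent, pool[parent])  # stand-in for LLM reflection
        if child not in pool:
            pool[child] = evaluate(child)
    best = max(pool, key=lambda c: sum(pool[c]))
    return best, pool


# Toy stand-ins (assumptions for illustration only).
HINTS = ["show each step", "verify the result", "state the units"]


def toy_evaluate(prompt: str) -> Scores:
    """Score 1.0 for each hint the prompt already contains."""
    return [1.0 if hint in prompt.lower() else 0.0 for hint in HINTS]


def toy_reflect(prompt: str, scores: Scores) -> str:
    """Use the per-example feedback to add the first missing hint."""
    for hint, score in zip(HINTS, scores):
        if score == 0.0:
            return f"{prompt} Always {hint}."
    return prompt


best, pool = evolve("Answer the question.", toy_evaluate, toy_reflect,
                    iterations=5, rng=random.Random(0))
print(best)
```

In a real setting the reflection step would be an LLM call that reads execution traces, and the evaluator would run the actual system; the selection logic is the only part this sketch tries to make concrete.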

Highlighted Details

  • DSPy Integration: Offers a direct dspy.GEPA API for seamless integration with the DSPy framework, simplifying prompt optimization tasks.
  • Performance Gains: Demonstrated improvements include boosting GPT-4.1 Mini's performance from 46.6% to 56.6% on the AIME benchmark and evolving DSPy programs to achieve 93% accuracy on the MATH benchmark (up from 67%).
  • Adapter Abstraction: Features a flexible GEPAAdapter interface, enabling GEPA to plug into diverse systems, including single-turn LLM interactions, multi-turn agents (e.g., terminal-bench), and full program evolution.
  • Broad Applicability: Capable of optimizing various text components, from system prompts and code snippets to complex program logic and control flow.
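To make the adapter idea more concrete, here is a rough sketch of what an adapter layer between an optimizer and a target system might look like. The `TextSystemAdapter` protocol, its method names, and the toy system are all hypothetical; they are not the signature of GEPA's actual `GEPAAdapter` interface, which should be checked in the project's own documentation.

```python
from dataclasses import dataclass
from typing import Dict, List, Protocol

Candidate = Dict[str, str]  # component name -> component text


@dataclass
class EvalResult:
    scores: List[float]  # one score per example
    traces: List[str]    # one human-readable trace per example


class TextSystemAdapter(Protocol):
    """Hypothetical interface; not GEPA's actual GEPAAdapter signature."""

    def evaluate(self, batch: List[Dict[str, str]], candidate: Candidate) -> EvalResult: ...

    def feedback(self, result: EvalResult) -> List[str]: ...


class ToySingleTurnAdapter:
    """Toy single-turn system: uppercases input only if the prompt asks for it."""

    def evaluate(self, batch: List[Dict[str, str]], candidate: Candidate) -> EvalResult:
        shout = "uppercase" in candidate["system_prompt"].lower()
        scores, traces = [], []
        for ex in batch:
            output = ex["input"].upper() if shout else ex["input"]
            scores.append(1.0 if output == ex["expected"] else 0.0)
            traces.append(
                f"input={ex['input']!r} output={output!r} expected={ex['expected']!r}"
            )
        return EvalResult(scores, traces)

    def feedback(self, result: EvalResult) -> List[str]:
        """Return traces of failing examples for a reflection step to read."""
        return [t for s, t in zip(result.scores, result.traces) if s < 1.0]


adapter: TextSystemAdapter = ToySingleTurnAdapter()
batch = [{"input": "hi", "expected": "HI"}]
before = adapter.evaluate(batch, {"system_prompt": "Echo the input."})
after = adapter.evaluate(batch, {"system_prompt": "Reply in uppercase."})
print(before.scores, after.scores)
```

The point of such a layer is that the optimizer only sees scores and textual feedback, so the same search loop could in principle drive a single-turn prompt, a multi-turn agent, or a whole program.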

Maintenance & Community

The project is associated with authors of the GEPA paper, including Lakshya A Agrawal and Matei Zaharia. Support and feature requests go through GitHub issues, and discussion takes place on Discord. Updates and announcements are posted on X (formerly Twitter) via @LakshyAAAgrawal and @lateinteraction.

Licensing & Compatibility

The provided README does not explicitly state the software license. Users should verify licensing terms before adoption, especially concerning commercial use or integration into closed-source projects.

Limitations & Caveats

Practical application requires access to and configuration of specific LLMs, often necessitating API keys. The optimization process itself can be computationally intensive and may require careful tuning of parameters like max_metric_calls to balance performance gains with evaluation budgets.
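The trade-off behind a setting like max_metric_calls can be sketched as a hard cap on how many times the evaluation metric may run. The wrapper below is an illustrative assumption, not how GEPA enforces its budget; `BudgetedMetric`, `BudgetExhausted`, and `best_within_budget` are hypothetical names, and the metric (string length) is a placeholder for a real, possibly expensive, evaluation.

```python
from typing import Callable, Iterable, Optional, Tuple


class BudgetExhausted(RuntimeError):
    """Raised when the metric has been called the maximum number of times."""


class BudgetedMetric:
    """Wrap a metric so that it refuses to run past a fixed number of calls."""

    def __init__(self, metric: Callable[[str], float], max_calls: int) -> None:
        self.metric = metric
        self.max_calls = max_calls
        self.calls = 0

    def __call__(self, candidate: str) -> float:
        if self.calls >= self.max_calls:
            raise BudgetExhausted(f"budget of {self.max_calls} metric calls used up")
        self.calls += 1
        return self.metric(candidate)


def best_within_budget(
    candidates: Iterable[str], metric: BudgetedMetric
) -> Tuple[Optional[str], float]:
    """Score candidates in order and stop as soon as the budget runs out."""
    best, best_score = None, float("-inf")
    for cand in candidates:
        try:
            score = metric(cand)
        except BudgetExhausted:
            break
        if score > best_score:
            best, best_score = cand, score
    return best, best_score


# Placeholder metric: longer strings score higher.
metric = BudgetedMetric(lambda text: float(len(text)), max_calls=3)
best, score = best_within_budget(["a", "abc", "ab", "abcdef"], metric)
print(best, score, metric.calls)
```

With a budget of three calls, the fourth candidate is never scored even though it would have won, which is the kind of trade-off a small budget implies.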

Health Check

Last Commit: 3 days ago
Responsiveness: Inactive
Pull Requests (30d): 39
Issues (30d): 22
Star History: 623 stars in the last 30 days
