prompts-royale by meistrari

Prompt engineering tool for automated A/B testing

Created 2 years ago

604 stars

Top 54.1% on SourcePulse

Project Summary

Prompts Royale is a web application designed to streamline prompt engineering by automating the creation and comparison of prompt candidates. It targets prompt engineers and AI developers seeking to optimize LLM instructions through an iterative, data-driven approach, ultimately identifying the most effective prompts for specific tasks.

How It Works

The application ranks prompt candidates with a Monte Carlo matchmaking system coupled with an ELO rating algorithm. Each prompt's rating is modeled as a normal distribution with a mean ELO score and a standard deviation. Duels are chosen probabilistically from these distributions, and a separate prompt judges the quality of each contender's responses to the given test cases. After each duel, the ELO scores and their standard deviations are updated using established ELO formulas, so the ranking converges toward the stronger prompts with relatively few battles.
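The rating update described above can be illustrated with a minimal sketch. It uses the textbook Elo expected-score formula; the starting score, the K-factor, and the way the standard deviation shrinks after each duel are assumptions made for illustration, and the names `Rating`, `expected_score`, and `update` are hypothetical rather than taken from the repository.

```python
from dataclasses import dataclass


@dataclass
class Rating:
    mean: float = 1000.0  # assumed starting score, not from the README
    sd: float = 350.0     # assumed starting uncertainty, not from the README


def expected_score(a: float, b: float) -> float:
    """Textbook Elo probability that a player rated `a` beats one rated `b`."""
    return 1.0 / (1.0 + 10 ** ((b - a) / 400.0))


def update(winner: Rating, loser: Rating, k: float = 32.0,
           sd_decay: float = 0.95, sd_floor: float = 50.0):
    """Return new (winner, loser) ratings after one duel.

    The mean moves by k * (actual - expected). Shrinking the standard
    deviation by a fixed factor is a placeholder for whatever rule the
    project actually uses.
    """
    delta = k * (1.0 - expected_score(winner.mean, loser.mean))
    new_winner = Rating(winner.mean + delta, max(sd_floor, winner.sd * sd_decay))
    new_loser = Rating(loser.mean - delta, max(sd_floor, loser.sd * sd_decay))
    return new_winner, new_loser
```

With two equally rated prompts the expected score is 0.5, so the winner gains k/2 points and the loser drops by the same amount. An upset win against a much higher-rated prompt moves both scores further.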

Quick Start & Requirements

Clone the repository: git clone git@github.com:meistrari/prompts-royale.git
Install dependencies using Bun: bun i
Run the development server: bun run dev
Requires Node.js v16+

Highlighted Details

Automatic generation of prompt candidates and test cases from user-defined descriptions and scenarios.
Monte Carlo matchmaking for efficient data gathering with minimal battles.
ELO rating system for robust prompt ranking based on win/loss history.
Customizable parameters for fine-tuning the battle and ranking process.
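As a rough illustration of Monte Carlo matchmaking, the sketch below samples a score for every prompt from its normal distribution, estimates how often each prompt comes out on top, and then picks duel contenders in proportion to that estimate. This pairing rule is an assumption, not the repository's documented algorithm, and `win_probabilities`, `pick_duel`, the sample count, and the small exploration weight are all hypothetical.

```python
import random


def win_probabilities(ratings, n_samples=2000, rng=None):
    """Estimate P(prompt has the highest score) by repeated sampling.

    `ratings` maps a prompt name to a (mean, sd) tuple.
    """
    rng = rng or random.Random()
    wins = {name: 0 for name in ratings}
    for _ in range(n_samples):
        # Draw one plausible score per prompt from its normal distribution.
        draws = {name: rng.gauss(mean, sd) for name, (mean, sd) in ratings.items()}
        wins[max(draws, key=draws.get)] += 1
    return {name: count / n_samples for name, count in wins.items()}


def pick_duel(ratings, n_samples=2000, rng=None):
    """Pick two distinct prompts, favoring those likely to be the best."""
    rng = rng or random.Random()
    probs = win_probabilities(ratings, n_samples, rng)
    names = list(probs)
    # A small constant keeps unlikely prompts selectable (assumed choice).
    first = rng.choices(names, weights=[probs[n] + 1e-3 for n in names], k=1)[0]
    rest = [n for n in names if n != first]
    second = rng.choices(rest, weights=[probs[n] + 1e-3 for n in rest], k=1)[0]
    return first, second
```

Calling `pick_duel({"a": (1200, 50), "b": (1000, 50), "c": (800, 50)})` returns a pair of distinct prompt names. Prompts with wide distributions still win some samples, which is how uncertain candidates keep getting battles until their rating settles.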

Maintenance & Community

The project was heavily inspired by mshumer/gpt-prompt-engineer. Further community and roadmap information is not explicitly detailed in the README.

Licensing & Compatibility

The repository's license is not specified in the provided README.

Limitations & Caveats

The project relies on Bun for dependency management, which is a less common choice than npm or yarn. The README does not specify the LLM APIs used or provide details on their integration, nor does it mention licensing, which could impact commercial use.

Health Check

Last Commit

2 years ago

Responsiveness

Inactive

Star History

1 star in the last 30 days