ChainForge  by ianarawjo

Visual environment for LLM prompt battle-testing

created 2 years ago
2,696 stars

Top 17.9% on sourcepulse

Project Summary

ChainForge is an open-source visual programming environment for prompt engineering and for evaluating large language models (LLMs). It lets users rapidly test and compare prompt variations, models, and settings, which supports systematic analysis and optimization of LLM interactions. The target audience includes researchers, developers, and anyone who needs to rigorously assess LLM performance.

How It Works

ChainForge follows a data flow paradigm and is built on ReactFlow and Flask. Users construct LLM interaction pipelines by visually connecting nodes that represent prompts, LLM providers, and evaluation metrics. This architecture enables combinatorial testing: ChainForge takes the cross-product of prompt inputs and model configurations, so a single flow can query and compare many LLM responses at scale.
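
The cross-product idea can be illustrated with a minimal Python sketch. This is not ChainForge's code or API: the function names (fill_template, build_jobs), the job dictionary layout, the model names, and the use of Python's str.format for templating are all assumptions made for illustration. It only shows how a few input values and model configurations multiply into a full set of queries.

```python
from itertools import product


def fill_template(template, variables):
    # Hypothetical helper: substitutes {name} placeholders using str.format.
    return template.format(**variables)


def build_jobs(template, inputs, models):
    """Cross every combination of input values with every model config.

    Illustrative only; the returned dict layout is an assumption,
    not ChainForge's internal representation.
    """
    names = list(inputs)
    jobs = []
    # product(...) yields one tuple per combination of input values.
    for values in product(*(inputs[name] for name in names)):
        variables = dict(zip(names, values))
        prompt = fill_template(template, variables)
        for model in models:
            jobs.append({"model": model, "vars": variables, "prompt": prompt})
    return jobs


template = "Translate '{phrase}' into {language}."
inputs = {
    "phrase": ["good morning", "thank you"],
    "language": ["French", "German", "Spanish"],
}
# Placeholder model configurations (names are made up).
models = [
    {"name": "model-a", "temperature": 0.0},
    {"name": "model-b", "temperature": 0.7},
]

jobs = build_jobs(template, inputs, models)
# 2 phrases x 3 languages x 2 model configs = 12 queries
print(len(jobs))
```

Because the number of queries is the product of the list lengths, adding one more value to any input or one more model configuration grows the job count multiplicatively, which is worth keeping in mind when providers charge per request.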

Quick Start & Requirements

  • Install via pip: pip install chainforge
  • Run locally: chainforge serve
  • Access at: localhost:8000
  • Requires Python 3.8+
  • Supported providers include OpenAI, Anthropic, Google Gemini/PaLM2, HuggingFace, Ollama, and more.
  • Docker installation is also available.
  • Documentation: https://chainforge.ai/
  • Web Play version: https://chainforge.ai/play/

Highlighted Details

  • Visual prompt engineering and LLM hypothesis testing.
  • Supports comparison across prompts, prompt parameters, and models.
  • Features include prompt permutations, model setting variations, and evaluation nodes.
  • Built-in AI features for synthetic data generation and accelerated prompt engineering.
  • Interactive response inspector with exportable data.
  • Chat turn functionality for multi-turn conversations.

Maintenance & Community

Developed by Ian Arawjo and collaborators from Harvard HCI's Glassman Lab, with support from NSF grants. Open to collaborators via GitHub Issues and Pull Requests.

Licensing & Compatibility

Released under the MIT License, permitting commercial use and integration with closed-source projects.

Limitations & Caveats

The web version has a limited feature set. Sharing flows is restricted to 10 flows at a time, each under 5MB, with older links breaking if the limit is exceeded. Visualization nodes currently support only numeric and boolean metrics.

Health Check

  • Last commit: 3 days ago
  • Responsiveness: Inactive
  • Pull requests (30d): 2
  • Issues (30d): 3
  • Star history: 117 stars in the last 90 days
