ChainForge by ianarawjo

Visual environment for LLM prompt battle-testing

Created 2 years ago

2,909 stars

Top 16.3% on SourcePulse

Project Summary

ChainForge is an open-source visual programming environment for prompt engineering and for evaluating Large Language Models (LLMs). Users can rapidly test and compare prompt variations, models, and model settings, which supports systematic analysis and tuning of LLM behavior. It is aimed at researchers, developers, and anyone who needs to assess LLM performance rigorously.

How It Works

ChainForge follows a dataflow paradigm and is built on ReactFlow and Flask. Users build LLM pipelines visually by connecting nodes that represent prompts, LLM providers, and evaluation metrics. Because a pipeline takes the cross-product of prompt inputs and model configurations, a single flow can query every combination and compare the resulting responses at scale.
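The cross-product idea can be illustrated outside the tool. The following is a minimal sketch in plain Python, not ChainForge's own code: the function name expand_queries, the template, and the model names "model-a" and "model-b" are all hypothetical placeholders chosen for illustration.

```python
from itertools import product


def expand_queries(template, variables, model_configs):
    """Return one query per combination of template values and model config.

    All names here are illustrative; this is not ChainForge's internal API.
    """
    names = list(variables)
    queries = []
    # Cross-product over the values of every template variable.
    for values in product(*(variables[name] for name in names)):
        fill = dict(zip(names, values))
        prompt = template.format(**fill)
        # Each filled prompt is then paired with every model configuration.
        for config in model_configs:
            queries.append({"prompt": prompt, "vars": fill, "config": config})
    return queries


template = "Translate '{phrase}' into {language}."
variables = {
    "phrase": ["good morning", "thank you"],
    "language": ["French", "German", "Japanese"],
}
# Hypothetical model settings; real provider names and parameters will differ.
model_configs = [
    {"model": "model-a", "temperature": 0.0},
    {"model": "model-b", "temperature": 0.7},
]

queries = expand_queries(template, variables, model_configs)
print(len(queries))  # 2 phrases x 3 languages x 2 configs = 12
```

The query count grows multiplicatively with each added variable or model configuration, which is why a small flow can produce a large batch of requests.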

Quick Start & Requirements

Install via pip: pip install chainforge
Run locally: chainforge serve
Access at: localhost:8000
Requires Python 3.8+
Supported providers include OpenAI, Anthropic, Google Gemini/PaLM2, HuggingFace, Ollama, and more.
Docker installation is also available.
Documentation: https://chainforge.ai/
Web Play version: https://chainforge.ai/play/
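Before or after running the commands above, a short preflight script can confirm the two prerequisites listed here: a Python 3.8+ interpreter and a server listening on port 8000. This is a sketch using only the standard library; the helper names python_ok and server_reachable are hypothetical and are not part of ChainForge.

```python
import socket
import sys

MIN_PYTHON = (3, 8)  # minimum version stated in the requirements above


def python_ok(version=None):
    """Return True if the given (major, minor) version meets the minimum."""
    version = sys.version_info[:2] if version is None else tuple(version[:2])
    return version >= MIN_PYTHON


def server_reachable(host="localhost", port=8000, timeout=1.0):
    """Return True if something accepts TCP connections at host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    print("Python version OK:", python_ok())
    print("Server reachable on localhost:8000:", server_reachable())
```

A successful TCP connection only shows that some process is listening on the port; it does not verify that the process is ChainForge.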

Highlighted Details

Visual prompt engineering and LLM hypothesis testing.
Supports comparison across prompts, prompt parameters, and models.
Features include prompt permutations, model setting variations, and evaluation nodes.
Built-in AI features for synthetic data generation and accelerated prompt engineering.
Interactive response inspector with exportable data.
Chat turn functionality for multi-turn conversations.
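To make the idea of an evaluation step concrete, here is a minimal sketch of scoring responses with one boolean metric and one numeric metric, then summarizing per model. The Response class and every function name below are hypothetical stand-ins written for this example; they do not reproduce ChainForge's evaluator interface, and the response texts are made up.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Response:
    """Hypothetical stand-in for one model response; not ChainForge's type."""
    llm: str
    prompt: str
    text: str


def mentions_keyword(response, keyword="Paris"):
    """Boolean metric: does the response text mention the keyword?"""
    return keyword.lower() in response.text.lower()


def word_count(response):
    """Numeric metric: number of whitespace-separated words."""
    return len(response.text.split())


def pass_rate_by_model(responses, metric):
    """Fraction of responses per model for which a boolean metric is true."""
    totals = defaultdict(lambda: [0, 0])  # llm -> [passed, seen]
    for r in responses:
        totals[r.llm][0] += int(bool(metric(r)))
        totals[r.llm][1] += 1
    return {llm: passed / seen for llm, (passed, seen) in totals.items()}


# Made-up responses for illustration only.
responses = [
    Response("model-a", "Capital of France?", "The capital of France is Paris."),
    Response("model-a", "Capital of France? Be brief.", "Paris."),
    Response("model-b", "Capital of France?", "I believe it is Lyon."),
    Response("model-b", "Capital of France? Be brief.", "Paris"),
]

rates = pass_rate_by_model(responses, mentions_keyword)
print(rates)
```

Keeping each metric as a small function that returns a boolean or a number makes the results easy to aggregate and chart across prompts and models.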

Maintenance & Community

Developed by Ian Arawjo and collaborators from Harvard HCI's Glassman Lab, with support from NSF grants. Open to collaborators via GitHub Issues and Pull Requests.

Licensing & Compatibility

Released under the MIT License, permitting commercial use and integration with closed-source projects.

Limitations & Caveats

The web version offers only a limited feature set. At most 10 flows can be shared at a time, each under 5MB; once that limit is exceeded, older share links break. Visualization nodes currently support only numeric and boolean metrics.
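The sharing limit can be modeled as a small bounded registry. The sketch below assumes oldest-first eviction and treats 5MB as 5 * 1024 * 1024 bytes; both are assumptions made for illustration, and the ShareRegistry class is hypothetical rather than a description of how ChainForge actually stores shared flows.

```python
from collections import OrderedDict

MAX_SHARED_FLOWS = 10              # limit stated above
MAX_FLOW_BYTES = 5 * 1024 * 1024   # assumption: 5MB read as 5 MiB


class ShareRegistry:
    """Hypothetical model of a per-user store of shared flows."""

    def __init__(self):
        self._flows = OrderedDict()  # link id -> payload, oldest first

    def share(self, link_id, payload):
        """Store a flow and return the link ids that broke as a result."""
        if len(payload) >= MAX_FLOW_BYTES:
            raise ValueError("flow must be under 5MB")
        self._flows[link_id] = payload
        self._flows.move_to_end(link_id)
        broken = []
        # Assumed policy: evict the oldest flows once the limit is exceeded.
        while len(self._flows) > MAX_SHARED_FLOWS:
            oldest, _ = self._flows.popitem(last=False)
            broken.append(oldest)
        return broken

    def is_live(self, link_id):
        """Return True if the link still resolves to a stored flow."""
        return link_id in self._flows


registry = ShareRegistry()
for i in range(MAX_SHARED_FLOWS):
    registry.share(f"flow-{i}", b"{}")

# Sharing an eleventh flow breaks the oldest link under the assumed policy.
print(registry.share("flow-10", b"{}"))
```

Under this model, anyone relying on a long-lived share link would need to export the flow and keep a local copy rather than depend on the link.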

Health Check

Last Commit: 1 week ago
Responsiveness: 1 day

Star History

16 stars in the last 30 days