phasellm  by wgryc

LLM evaluation and workflow framework

created 2 years ago
457 stars

Top 67.1% on sourcepulse

Project Summary

PhaseLLM is an open-source framework for evaluating and managing applications built on Large Language Models (LLMs). It standardizes API calls across providers such as OpenAI, Cohere, and Anthropic, lets users compare model outputs, and automates testing by having a more capable model assess the outputs of simpler ones. The goal is to make it easier to launch robust LLM products.

How It Works

PhaseLLM provides a unified interface for interacting with various LLM providers, abstracting away differences in their APIs. Its core strength lies in its evaluation framework, which allows users to benchmark different models and prompts. A key feature is the ability to use a powerful LLM (e.g., GPT-4) to evaluate the performance of other LLMs (e.g., GPT-3.5) against defined objectives, considering factors like cost and speed.
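
The pattern described above can be sketched in a few lines of Python. This is an illustration only and does not use PhaseLLM's own classes: ChatModel, StubModel, and judge_pair are hypothetical names, and the stub models return canned text instead of calling a provider API. It shows a shared interface for models and a judge model choosing between two responses against a stated objective.

```python
from dataclasses import dataclass
from typing import Callable, Protocol


class ChatModel(Protocol):
    """Hypothetical common interface; not PhaseLLM's actual class."""

    name: str

    def complete(self, prompt: str) -> str: ...


@dataclass
class StubModel:
    """Stand-in for a provider wrapper; a real one would call the provider's API."""

    name: str
    reply_fn: Callable[[str], str]

    def complete(self, prompt: str) -> str:
        return self.reply_fn(prompt)


def judge_pair(judge: ChatModel, objective: str, prompt: str, a: str, b: str) -> str:
    """Ask the judge which response better meets the objective; returns 'A' or 'B'."""
    question = (
        f"Objective: {objective}\nPrompt: {prompt}\n"
        f"Response A: {a}\nResponse B: {b}\n"
        "Answer with a single letter, A or B."
    )
    verdict = judge.complete(question).strip().upper()
    if verdict[:1] not in ("A", "B"):
        raise ValueError(f"Unparseable verdict: {verdict!r}")
    return verdict[0]


# Two candidate models behind the same interface, plus a judge.
# The judge stub always answers "B"; a real judge would be a stronger LLM.
small = StubModel("small", lambda p: "Hello there, friend!")
large = StubModel("large", lambda p: "Hi")
judge = StubModel("judge", lambda q: "B")

prompt = "Say hi"
winner = judge_pair(judge, "Be concise", prompt,
                    small.complete(prompt), large.complete(prompt))
print(winner)
```

Because every model exposes the same complete method, swapping a provider or promoting a different model to the judge role would not change the evaluation code.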

Quick Start & Requirements

  • Install via pip: pip install phasellm
  • For local LLM execution (e.g., DollyWrapper): pip install "phasellm[complete]" (quoted so that shells such as zsh do not interpret the brackets)
  • Requires API keys for supported LLM providers (OpenAI, Anthropic, Cohere).
  • Official Docs: https://phasellm.readthedocs.io/en/latest/
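
Since API keys are required, a quick pre-flight check can catch a missing key before any request is made. The sketch below is an assumption-laden example: the environment variable names follow each provider's common convention and are not confirmed by this summary, and missing_providers is a hypothetical helper, not part of PhaseLLM.

```python
import os
from typing import Mapping

# Assumed variable names (common provider conventions); PhaseLLM may expect
# keys to be supplied differently, for example as constructor arguments.
PROVIDER_KEYS = {
    "OpenAI": "OPENAI_API_KEY",
    "Anthropic": "ANTHROPIC_API_KEY",
    "Cohere": "COHERE_API_KEY",
}


def missing_providers(env: Mapping[str, str] = os.environ) -> list[str]:
    """Return the providers whose key variable is unset or empty."""
    return [name for name, var in PROVIDER_KEYS.items() if not env.get(var)]


missing = missing_providers()
if missing:
    print("No API key found for: " + ", ".join(missing))
```

Only the providers you intend to call need a key, so a non-empty result is a warning rather than an error.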

Highlighted Details

  • Standardizes API calls for OpenAI, Cohere, Anthropic, and other providers.
  • Built-in evaluation frameworks for comparing LLM outputs.
  • Automates model evaluation using advanced LLMs to assess simpler ones.
  • Facilitates plug-and-play integration of different models and prompts.
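
As a rough illustration of plug-and-play comparison, the sketch below runs every model against every prompt and records the output and wall-clock latency. It is not PhaseLLM code: run_grid is a hypothetical helper, and the "models" are plain Python callables standing in for provider wrappers.

```python
import time
from itertools import product
from typing import Callable, Dict, List, Tuple

Row = Tuple[str, str, str, float]  # (model name, prompt, output, seconds)


def run_grid(models: Dict[str, Callable[[str], str]], prompts: List[str]) -> List[Row]:
    """Call each model on each prompt, timing every call."""
    rows: List[Row] = []
    for (name, model), prompt in product(models.items(), prompts):
        start = time.perf_counter()
        output = model(prompt)
        elapsed = time.perf_counter() - start
        rows.append((name, prompt, output, elapsed))
    return rows


# Stand-in models; real ones would wrap calls to an LLM provider.
models = {"upper": str.upper, "echo": lambda p: p}
for name, prompt, output, seconds in run_grid(models, ["hi", "yo"]):
    print(f"{name:>5} | {prompt} -> {output} ({seconds:.6f}s)")
```

The resulting rows could then be passed to a judge model or compared on latency and cost.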

Maintenance & Community

  • Project initiated by Phase AI.
  • Contact for inquiries: w (at) phaseai (dot) com.
  • Follow on Twitter for updates.

Licensing & Compatibility

  • License: Not explicitly stated in the README.
  • Compatibility: Designed for commercial use and integration with closed-source LLM products via API keys.

Limitations & Caveats

The [complete] installation option is necessary for local LLM execution, but the specific models supported locally are not detailed. The license is not specified, which may impact commercial adoption decisions.

Health Check

  • Last commit: 6 months ago
  • Responsiveness: 1 week
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 2 stars in the last 90 days
