athina-evals by athina-ai

Python SDK for LLM response evaluation

created 1 year ago
290 stars

Top 91.7% on sourcepulse

Project Summary

Athina-evals provides a Python SDK for evaluating Large Language Model (LLM) responses, offering over 50 preset evaluations and support for custom ones. It's designed for AI teams focused on observability and experimentation, serving as a companion to the Athina IDE for prototyping, running experiments, and comparing datasets.

How It Works

The SDK allows programmatic execution of evaluations, with results visualized and managed within the Athina IDE. This integrated approach facilitates side-by-side dataset comparison and experiment tracking, streamlining the LLM development lifecycle.

Quick Start & Requirements

See the project's README on GitHub for installation and setup instructions.

Highlighted Details

  • Over 50 preset evaluations available.
  • Supports custom evaluation creation.
  • Integrates with Athina IDE for enhanced workflow.
  • Enables side-by-side dataset comparison.

Maintenance & Community

No specific contributor or community details are provided in the README.

Licensing & Compatibility

The README does not specify a license.

Limitations & Caveats

The README does not detail any limitations or caveats.

Health Check

  • Last commit: 1 month ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 12 stars in the last 90 days
