evidently by evidentlyai

Open-source framework for ML/LLM observability

Created 5 years ago
6,776 stars

Top 7.5% on SourcePulse

Project Summary

Evidently is an open-source Python framework for evaluating, testing, and monitoring AI and ML systems, including LLMs. It supports both tabular and text data, offering over 100 built-in metrics for tasks ranging from data drift detection to RAG pipeline quality. The framework is designed for flexibility, allowing users to perform one-off evaluations or host a full monitoring service, making it suitable for researchers, data scientists, and ML engineers.

How It Works

Evidently operates through modular components: Reports and Test Suites for offline analysis and validation, and a Monitoring UI for visualizing results over time. Reports generate interactive visualizations and summaries of various quality evaluations, which can be customized with presets or individual metrics. Test Suites build upon Reports by adding pass/fail conditions, enabling automated checks for CI/CD pipelines. The framework supports custom metrics and offers an open architecture for integration with existing tools.
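
The relationship between Reports and Test Suites described above can be illustrated with a minimal, library-free sketch. It does not use Evidently's API: the names mean_shift, build_report, Check, and run_test_suite are hypothetical, and the absolute difference in column means is a deliberately simple stand-in for a real drift metric. It only shows the idea of computing metric values first and then layering pass/fail conditions on top of them.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Callable, Dict, List

Dataset = Dict[str, List[float]]


def mean_shift(reference: List[float], current: List[float]) -> float:
    """Stand-in metric (assumption): absolute difference between column means."""
    return abs(mean(current) - mean(reference))


def build_report(reference: Dataset, current: Dataset) -> Dict[str, float]:
    """'Report' step: compute one metric value per column, with no verdict."""
    return {col: mean_shift(reference[col], current[col]) for col in reference}


@dataclass
class Check:
    """A pass/fail condition attached to one metric value."""
    column: str
    condition: Callable[[float], bool]
    description: str


def run_test_suite(report: Dict[str, float], checks: List[Check]) -> Dict[str, bool]:
    """'Test Suite' step: evaluate each condition against the report values."""
    return {c.description: c.condition(report[c.column]) for c in checks}


# Toy data, invented for illustration only.
reference = {"age": [30.0, 32.0, 34.0], "income": [50.0, 52.0, 54.0]}
current = {"age": [31.0, 32.0, 33.0], "income": [60.0, 62.0, 64.0]}

report = build_report(reference, current)
checks = [
    Check("age", lambda v: v < 1.0, "age mean shift below 1.0"),
    Check("income", lambda v: v < 5.0, "income mean shift below 5.0"),
]
results = run_test_suite(report, checks)

# A CI job could fail the build when any check fails.
all_passed = all(results.values())
print(report)
print(results)
print("all passed:", all_passed)
```

In this toy run the "income" check fails, so a CI step that gates on all_passed would stop the pipeline. In Evidently itself, the metrics, presets, and conditions come from the library; consult its documentation for the actual classes and signatures.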

Quick Start & Requirements

Highlighted Details

  • Supports over 100 built-in metrics for data drift, quality, classification, regression, LLM outputs, and RAG.
  • Offers both offline evaluation (Reports, Test Suites) and live monitoring capabilities.
  • Includes a self-hostable open-source monitoring UI and a cloud offering with additional features.
  • Allows custom metric creation and integration with existing MLOps tools.

Maintenance & Community

  • Active community with a Discord server for discussion and support.
  • The project is updated regularly, and contributions are welcome; a contribution guide is provided.
  • Blog and Twitter accounts provide project updates and insights.

Licensing & Compatibility

  • Apache 2.0 License.
  • Permissive license suitable for commercial use and integration into closed-source projects.

Limitations & Caveats

The project is under active development, and its LLM evaluation capabilities in particular are still evolving. Consult the documentation for the currently supported metrics and features.

Health Check

  • Last Commit: 19 hours ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 15
  • Issues (30d): 1
  • Star History: 116 stars in the last 30 days

Explore Similar Projects

Starred by Gregor Zunic (Cofounder of Browser Use), Eric Zhu (Coauthor of AutoGen; Research Scientist at Microsoft Research), and 14 more.

openllmetry by traceloop

0.5%
7k
Open-source observability SDK for LLM applications
Created 2 years ago
Updated 1 day ago
Starred by Luis Capelo (Cofounder of Lightning AI), Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems"), and 6 more.

opik by comet-ml

1.2%
15k
Open-source LLM evaluation framework for RAG, agents, and more
Created 2 years ago
Updated 4 hours ago