opik by comet-ml

Open-source LLM evaluation framework for RAG, agents, and more

Created 2 years ago

17,223 stars

Top 2.7% on SourcePulse


9 Experts Love This Project

Yaowei Zheng

Author of LLaMA-Factory

Luis Capelo

Cofounder of Lightning AI

Chip Huyen

Author of "AI Engineering", "Designing Machine Learning Systems"

Sourabh Bajaj

Cofounder of Uplimit

and 5 more!

Project Summary

Opik is an open-source platform designed to help developers debug, evaluate, and monitor Large Language Model (LLM) applications, including RAG systems and agentic workflows. It offers comprehensive tracing, automated evaluations using "LLM as a judge" metrics, and production-ready dashboards, aiming to improve the performance, speed, and cost-efficiency of LLM-based systems.

How It Works

Opik provides a Python SDK and a local or hosted platform for logging and analyzing LLM interactions. It captures detailed traces of LLM calls, user feedback, and prompt variations. Its main advantage is an integrated evaluation suite: pre-built "LLM as a judge" metrics for complex tasks such as hallucination detection and relevance scoring, heuristic metrics, and support for custom metrics. Together these enable automated quality assessment that can run as part of a CI/CD pipeline.
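To make the idea of a trace concrete, here is a minimal standard-library sketch of decorator-based tracing. It is not Opik's implementation or API: the names track, TraceRecord, TRACE_LOG, and answer_question are hypothetical, and the in-memory list stands in for whatever backend a real platform would send records to.

```python
import functools
import time
from dataclasses import dataclass
from typing import Any, Callable, Dict, List


@dataclass
class TraceRecord:
    """One logged call: what went in, what came out, and how long it took."""
    name: str
    inputs: Dict[str, Any]
    output: Any
    duration_s: float


# Hypothetical in-memory sink; a real platform would ship records to a server.
TRACE_LOG: List[TraceRecord] = []


def track(fn: Callable[..., Any]) -> Callable[..., Any]:
    """Wrap fn so every call appends a TraceRecord to TRACE_LOG."""
    @functools.wraps(fn)
    def wrapper(*args: Any, **kwargs: Any) -> Any:
        start = time.perf_counter()
        result = fn(*args, **kwargs)
        TRACE_LOG.append(
            TraceRecord(
                name=fn.__name__,
                inputs={"args": args, "kwargs": kwargs},
                output=result,
                duration_s=time.perf_counter() - start,
            )
        )
        return result
    return wrapper


@track
def answer_question(question: str) -> str:
    # Stand-in for a real LLM call (assumption for illustration only).
    return f"Echo: {question}"
```

Calling answer_question("What is Opik?") returns the answer as usual and leaves one record in TRACE_LOG holding the function name, arguments, output, and elapsed time, which is roughly the information a trace viewer needs to display a call.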

Quick Start & Requirements

Install SDK: pip install opik
Configure: opik configure on the command line (or opik.configure(use_local=True) in code when targeting a local deployment)
Local Deployment: Clone repo, run ./opik.sh (Linux/Mac) or .\opik.ps1 (Windows). Access at localhost:5173.
Integrations: Supports OpenAI, LiteLLM, LangChain, Haystack, Anthropic, Bedrock, CrewAI, and more.

Highlighted Details

Comprehensive tracing for LLM calls across various frameworks.
"LLM as a judge" metrics for automated evaluation of relevance, hallucination, etc.
CI/CD integration via PyTest for automated testing.
High-volume trace ingestion capability for production monitoring.
Local or Comet.com hosted deployment options.
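The combination of heuristic metrics and PyTest-based CI checks can be sketched with the standard library alone. The example below is an illustration under stated assumptions, not Opik's metric classes or its PyTest integration: contains_score, fake_app, DATASET, mean_score, and the 0.9 threshold are all hypothetical names and values chosen for the sketch.

```python
from typing import Callable, List, Tuple


def contains_score(output: str, expected: str) -> float:
    """Heuristic metric: 1.0 if the expected text appears in the output
    (case-insensitive), otherwise 0.0."""
    return 1.0 if expected.lower() in output.lower() else 0.0


def fake_app(question: str) -> str:
    """Stand-in for an LLM application (assumption for illustration only)."""
    answers = {
        "capital of france": "The capital of France is Paris.",
        "capital of japan": "The capital of Japan is Tokyo.",
    }
    return answers.get(question.lower().rstrip("?"), "I don't know.")


# Tiny evaluation set of (question, expected substring) pairs.
DATASET: List[Tuple[str, str]] = [
    ("Capital of France?", "Paris"),
    ("Capital of Japan?", "Tokyo"),
]


def mean_score(app: Callable[[str], str], dataset: List[Tuple[str, str]]) -> float:
    """Average the heuristic metric over every item in the dataset."""
    scores = [contains_score(app(q), expected) for q, expected in dataset]
    return sum(scores) / len(scores)


def test_quality_gate() -> None:
    # A test runner such as pytest would collect this function; the build
    # fails if average quality drops below the (hypothetical) threshold.
    assert mean_score(fake_app, DATASET) >= 0.9
```

An "LLM as a judge" metric would fit the same shape: swap contains_score for a function that asks a model to grade the output and return a number, and keep the threshold assertion unchanged.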

Maintenance & Community

Developed by Comet.
Active development indicated by recent version updates (e.g., 1.7.0).
Community support via Slack and Twitter.
Clear contributing guidelines.

Licensing & Compatibility

Apache 2.0 License.
Permissive license suitable for commercial use and integration into closed-source projects.

Limitations & Caveats

The platform is still evolving quickly: version 1.7.0 introduced significant changes, and users are advised to check the changelog before upgrading. Many integrations are listed, but users of unlisted frameworks may need to implement custom tracking with the @opik.track decorator.

Health Check

Last Commit

21 hours ago

Responsiveness

1 day

Pull Requests (30d)

251

Issues (30d)

Star History

565 stars in the last 30 days