agentscope-ai
AI application evaluation and quality rewards framework
Top 87.6% on SourcePulse
OpenJudge is an open-source evaluation framework for AI applications such as agents and chatbots. It provides a unified workflow for data collection, grading, evaluation, and analysis, streamlining quality assessment and continuous optimization. The framework aims to make application evaluation simpler, more rigorous, and better integrated.
How It Works
The framework supports a systematic evaluation workflow: collect test data, define graders, run evaluations at scale, analyze weaknesses, and iterate. It offers a comprehensive library of production-ready graders and multiple flexible methods for building custom graders, including zero-shot rubric generation from task descriptions, data-driven rubric generation from examples, and training dedicated judge models. This approach allows for adaptable and robust quality assessment, converting grading results into reward signals for fine-tuning applications.
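The grade-then-analyze loop described above can be sketched in plain Python. Note the names here (`Grader`, `evaluate_batch`) are illustrative stand-ins, not OpenJudge's actual API, and the deterministic scoring function is a placeholder for an LLM judge.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical sketch of the collect -> grade -> analyze workflow.
# These names are illustrative; they do not mirror the OpenJudge API.

@dataclass
class Grader:
    name: str
    rubric: str                             # human-readable grading criteria
    score_fn: Callable[[str, str], float]   # (prompt, response) -> score in [0, 1]

def evaluate_batch(grader, samples):
    """Run a grader over (prompt, response) pairs and collect weaknesses."""
    results = []
    for prompt, response in samples:
        score = grader.score_fn(prompt, response)
        results.append({"prompt": prompt, "score": score, "passed": score >= 0.5})
    failures = [r for r in results if not r["passed"]]
    return results, failures

# Stand-in deterministic judge; in practice this would call an LLM judge
# built from a zero-shot or data-driven rubric.
length_grader = Grader(
    name="conciseness",
    rubric="Responses should answer in under 20 words.",
    score_fn=lambda prompt, response: 1.0 if len(response.split()) < 20 else 0.0,
)

results, failures = evaluate_batch(
    length_grader,
    [("What is 2+2?", "4"),
     ("Explain gravity.", " ".join(["word"] * 30))],
)
```

The per-sample scores in `results` could then serve directly as reward signals for fine-tuning, while `failures` points to the weaknesses to iterate on.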
Quick Start & Requirements
Install via `pip install py-openjudge`. The APIs are built on asyncio and require LLM API keys (e.g., OpenAI). Documentation: https://agentscope-ai.github.io/OpenJudge/
Highlighted Details
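Since the APIs are asyncio-based, callers fan out grading requests concurrently. The snippet below shows only that generic pattern; `grade_one` is a stand-in for a real LLM-judge call (which would need an API key such as `OPENAI_API_KEY` in the environment) and is not part of OpenJudge itself.

```python
import asyncio

# Illustrative asyncio fan-out only; grade_one is a placeholder for an
# LLM-judge network call, which in a real setup requires an LLM API key.

async def grade_one(response: str) -> float:
    await asyncio.sleep(0)  # placeholder for the actual API request
    return 1.0 if response.strip() else 0.0

async def grade_all(responses):
    # Launch all grading calls concurrently and gather their scores.
    return await asyncio.gather(*(grade_one(r) for r in responses))

scores = asyncio.run(grade_all(["good answer", "", "another answer"]))
```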
Maintenance & Community
Licensing & Compatibility
Limitations & Caveats
Version 0.2.0 is not backward compatible with the legacy v0.1.x package (rm-gallery); migrating requires updating imports and usage. The project appears to be under active development, with "Planned" integrations indicating ongoing feature expansion.