Agent observability and self-learning toolkit
Top 37.1% on SourcePulse
Judgeval provides open-source tooling for tracing, evaluating, and monitoring autonomous, stateful agents, enabling continuous learning and self-improvement. It captures runtime data from agent-environment interactions and is aimed at developers and researchers building and deploying AI agents.
How It Works
Judgeval integrates via a Python SDK that automatically traces agent execution, capturing inputs, outputs, tool calls, latency, and custom metadata. This data can be exported for analysis, used to build custom evaluators (including LLM-as-a-judge), or used to trigger alerts for production monitoring. The approach facilitates debugging, identification of performance bottlenecks, and data-driven agent optimization.
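The kind of span data such a tracer collects can be sketched with a plain decorator. This is an illustrative sketch only, not judgeval's actual API; the names `trace`, `SPANS`, and `search_tool` are hypothetical:

```python
import functools
import time

# Hypothetical in-memory span store, standing in for a tracing backend.
SPANS = []

def trace(fn):
    """Record inputs, output, and latency for each call, as a tracing SDK might."""
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        result = fn(*args, **kwargs)
        SPANS.append({
            "name": fn.__name__,
            "inputs": {"args": args, "kwargs": kwargs},
            "output": result,
            "latency_s": time.perf_counter() - start,
        })
        return result
    return wrapper

@trace
def search_tool(query):
    # Stand-in for a real tool call made by an agent.
    return f"results for {query!r}"

search_tool("weather in Paris")
print(SPANS[0]["name"], SPANS[0]["output"])
```

Collected spans like these are what downstream evaluators and alerting rules would consume.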
Quick Start & Requirements
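Assuming a standard shell environment, setup amounts to installing the package and exporting credentials. The values below are placeholders, not real keys:

```shell
# Install the SDK
pip install judgeval

# Placeholder values — substitute your actual Judgment Platform credentials
export JUDGMENT_API_KEY="<your-api-key>"
export JUDGMENT_ORG_ID="<your-org-id>"

# For a self-hosted instance, point the SDK at it instead:
# export JUDGMENT_API_URL="<your-instance-url>"
```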
pip install judgeval
Set the JUDGMENT_API_KEY and JUDGMENT_ORG_ID environment variables (or JUDGMENT_API_URL for a self-hosted instance).
Highlighted Details
Maintenance & Community
Maintained by Judgment Labs. Community channels include Discord (https://discord.gg/tGVFf8UBUY) and X (https://x.com/JudgmentLabs).
Licensing & Compatibility
The repository does not explicitly state a license in the provided README; license status should be clarified before commercial use or closed-source linking.
Limitations & Caveats
The unspecified license is a significant blocker to assessing commercial usability. Core functionality depends on connecting to the Judgment Platform, requiring either API keys or a self-hosted instance.