Framework for large language model evaluations
Inspect is a Python framework for evaluating large language models (LLMs), developed by the UK AI Security Institute. It offers built-in components for prompt engineering, tool usage, multi-turn dialogue, and model-graded evaluations, enabling users to systematically assess LLM performance.
How It Works
Inspect provides a modular architecture allowing extensions via other Python packages. This design facilitates the integration of new elicitation and scoring techniques, promoting flexibility and extensibility in LLM evaluation methodologies.
Quick Start & Requirements
pip install -e ".[dev]"
make hooks
make check
make test
Highlighted Details
Maintenance & Community
The project is developed by the UK AI Security Institute. Further community engagement details are not specified in the README.
Licensing & Compatibility
The license is not specified in the README.
Limitations & Caveats
Because the README does not specify a license, users should verify licensing terms in the repository itself before commercial use or closed-source integration.