inspect_ai by UKGovernmentBEIS

Framework for large language model evaluations

Created 2 years ago
1,459 stars

Top 28.1% on SourcePulse

Project Summary

Inspect is a Python framework for evaluating large language models (LLMs), developed by the UK AI Security Institute. It offers built-in components for prompt engineering, tool usage, multi-turn dialogue, and model-graded evaluations, enabling users to systematically assess LLM performance.
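
As a rough illustration of how these pieces fit together, the sketch below defines a minimal evaluation task against the documented Task API; the sample data, model identifier, and keyword names (e.g. solver vs. the older plan) are illustrative and may differ between Inspect versions.

    from inspect_ai import Task, task, eval
    from inspect_ai.dataset import Sample
    from inspect_ai.scorer import match
    from inspect_ai.solver import generate

    @task
    def hello_world():
        # A one-sample inline dataset; real evaluations typically load
        # samples from CSV/JSON files or external datasets.
        return Task(
            dataset=[Sample(input="Just reply with 'Hello World'", target="Hello World")],
            solver=[generate()],  # elicitation: a single model turn
            scorer=match(),       # scoring: string match against the target
        )

    # Run against the provider/model of your choice (identifier is a placeholder):
    # eval(hello_world(), model="openai/gpt-4o")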

How It Works

Inspect has a modular architecture: extensions, such as new elicitation (solver) and scoring techniques, can be provided by separate Python packages, so evaluation workflows can be customized without modifying the framework itself.
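
For instance, a custom scorer can be registered with the @scorer decorator and then passed to any Task like a built-in one. The following is a minimal sketch assuming the Score, Target, and TaskState types exposed by inspect_ai.scorer and inspect_ai.solver; exact names may vary across versions.

    from inspect_ai.scorer import CORRECT, INCORRECT, Score, Target, accuracy, scorer
    from inspect_ai.solver import TaskState

    @scorer(metrics=[accuracy()])
    def exact_answer():
        # A scorer is an async callable that receives the completed TaskState
        # and the sample's target, and returns a Score.
        async def score(state: TaskState, target: Target) -> Score:
            answer = state.output.completion.strip()
            return Score(
                value=CORRECT if answer == target.text else INCORRECT,
                answer=answer,
            )
        return score

Packaged this way, solvers and scorers distributed in separate Python packages can be imported and used in tasks without any changes to Inspect itself.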

Quick Start & Requirements

  • Install the released package with pip install inspect-ai.
  • For development, clone the repository and install with pip install -e ".[dev]", which includes the optional development dependencies.
  • Pre-commit hooks can be installed via make hooks.
  • Linting, formatting, and tests are available via make check and make test.
  • Recommended VS Code extensions include Python, Ruff, and MyPy.
  • Official documentation is available at https://inspect.aisi.org.uk/.

Highlighted Details

  • Comprehensive framework for LLM evaluations.
  • Built-in support for prompt engineering, tool usage, and multi-turn dialogue (see the sketch after this list).
  • Facilitates model-graded evaluations.
  • Extensible architecture for custom elicitation and scoring techniques.
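
To make the tool-use and model-graded items concrete, here is a hedged sketch that combines a custom tool with the use_tools() solver and the model_graded_qa() scorer; the tool, dataset content, and grader defaults are illustrative rather than taken from the project.

    from inspect_ai import Task, task
    from inspect_ai.dataset import Sample
    from inspect_ai.scorer import model_graded_qa
    from inspect_ai.solver import generate, use_tools
    from inspect_ai.tool import tool

    @tool
    def add():
        async def execute(x: int, y: int):
            """Add two numbers.

            Args:
                x: First number to add.
                y: Second number to add.
            """
            return x + y
        return execute

    @task
    def arithmetic_with_tools():
        return Task(
            dataset=[Sample(input="What is 17 + 25?", target="42")],
            # use_tools() makes the tool available to the model; generate()
            # then runs the (possibly multi-turn) tool-calling loop.
            solver=[use_tools(add()), generate()],
            # model_graded_qa() asks a grader model to judge the answer
            # against the target instead of string matching.
            scorer=model_graded_qa(),
        )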

Maintenance & Community

The project is developed by the UK AI Security Institute. Further community engagement details are not specified in the README.

Licensing & Compatibility

The license is not specified in the README.

Limitations & Caveats

The README does not specify licensing details, which may impact commercial use or closed-source integration.

Health Check

  • Last commit: 11 hours ago
  • Responsiveness: 1 day
  • Pull requests (30d): 140
  • Issues (30d): 47

Star History

101 stars in the last 30 days
