llm_rules  by normster

Benchmark for evaluating LLM rule-following capabilities

Created 2 years ago
253 stars

Top 99.3% on SourcePulse

GitHubView on GitHub
Project Summary

Summary

RuLES (Rule-following Language Evaluation Scenarios) is a benchmark designed to rigorously evaluate the rule-following capabilities of Large Language Models (LLMs). It addresses the critical need for understanding LLM reliability and safety by providing a systematic way to test adherence to instructions. This benchmark is valuable for researchers and developers seeking to quantify and improve LLM behavior.

How It Works

The project implements a comprehensive benchmark suite with diverse test cases targeting various aspects of rule-following. It supports evaluation of models via APIs (OpenAI, Anthropic, Google VertexAI) and locally hosted models using vLLM. The core approach involves running LLMs against these carefully crafted scenarios and analyzing their outputs to derive rule-following scores, offering a novel and precise method for assessing a key LLM limitation.

Quick Start & Requirements

  • Installation: Install as an editable package: pip install -e .. For API wrappers, use pip install -e .[models].
  • Prerequisites: API keys for OpenAI, Anthropic, or Google (configured via .env file), local HuggingFace models (e.g., Llama-2, downloaded via snapshot_download), Python. GPU is required for vLLM evaluation and GCG attack.
  • Links: [demo], [website], [paper] (refer to description).
  • Setup: Requires model downloads and API key configuration. GPU hardware is a significant requirement for certain evaluation modes.

Highlighted Details

  • Features a revised v2.0 benchmark with new test cases (updated March 2024).
  • Updated to v3.0.0 (September 2024) with prompt wording adjustments; previous results may not be directly comparable.
  • Supports HuggingFace chat templates and includes scripts for re-evaluation, visualization, GCG attacks, and fine-tuning experiments.
  • Bug fixes and added support for Google VertexAI API models (June 2024).

Maintenance & Community

The project shows active maintenance with multiple updates in 2024, including significant revisions to the benchmark and evaluation scripts. No specific community channels (e.g., Discord, Slack) are listed in the provided README text.

Licensing & Compatibility

The license type is not explicitly stated in the provided README text, which is a critical omission for assessing commercial use or closed-source linking compatibility.

Limitations & Caveats

  • A GPU is mandatory for running evaluations using vLLM and for executing the GCG attack.
  • Users should be aware that prompt wording changes between benchmark versions (v2.0, v3.0.0) can affect result comparability.
  • The absence of a specified license prevents definitive statements on compatibility.
Health Check
Last Commit

1 year ago

Responsiveness

Inactive

Pull Requests (30d)
0
Issues (30d)
0
Star History
4 stars in the last 30 days

Explore Similar Projects

Feedback? Help us improve.