llm_rules by normster

Benchmark for evaluating LLM rule-following capabilities

Created 2 years ago

255 stars

Top 98.8% on SourcePulse

View on GitHub

4 Experts Love This Project

Andrej Karpathy

Founder of Eureka Labs; Formerly at Tesla, OpenAI; Author of CS 231n

Gabriel Almeida

Cofounder of Langflow

Ishaan Jaffer

Cofounder of LiteLLM

Dan Hendrycks

Author of MMLU; Executive Director at Center for AI Safety

Project Summary

Summary

RuLES (Rule-following Language Evaluation Scenarios) is a benchmark designed to rigorously evaluate the rule-following capabilities of Large Language Models (LLMs). It addresses the critical need for understanding LLM reliability and safety by providing a systematic way to test adherence to instructions. This benchmark is valuable for researchers and developers seeking to quantify and improve LLM behavior.

How It Works

The project implements a comprehensive benchmark suite with diverse test cases targeting various aspects of rule-following. It supports evaluation of models via APIs (OpenAI, Anthropic, Google VertexAI) and locally hosted models using vLLM. The core approach involves running LLMs against these carefully crafted scenarios and analyzing their outputs to derive rule-following scores, offering a novel and precise method for assessing a key LLM limitation.

Quick Start & Requirements

Installation: Install as an editable package: pip install -e .. For API wrappers, use pip install -e .[models].
Prerequisites: API keys for OpenAI, Anthropic, or Google (configured via .env file), local HuggingFace models (e.g., Llama-2, downloaded via snapshot_download), Python. GPU is required for vLLM evaluation and GCG attack.
Links: [demo], [website], [paper] (refer to description).
Setup: Requires model downloads and API key configuration. GPU hardware is a significant requirement for certain evaluation modes.

Highlighted Details

Features a revised v2.0 benchmark with new test cases (updated March 2024).
Updated to v3.0.0 (September 2024) with prompt wording adjustments; previous results may not be directly comparable.
Supports HuggingFace chat templates and includes scripts for re-evaluation, visualization, GCG attacks, and fine-tuning experiments.
Bug fixes and added support for Google VertexAI API models (June 2024).

Maintenance & Community

The project shows active maintenance with multiple updates in 2024, including significant revisions to the benchmark and evaluation scripts. No specific community channels (e.g., Discord, Slack) are listed in the provided README text.

Licensing & Compatibility

The license type is not explicitly stated in the provided README text, which is a critical omission for assessing commercial use or closed-source linking compatibility.

Limitations & Caveats

A GPU is mandatory for running evaluations using vLLM and for executing the GCG attack.
Users should be aware that prompt wording changes between benchmark versions (v2.0, v3.0.0) can affect result comparability.
The absence of a specified license prevents definitive statements on compatibility.

Health Check

Last Commit

1 year ago

Responsiveness

Inactive

Pull Requests (30d)

Issues (30d)

Star History

1 stars in the last 30 days