MisguidedAttention by cpldcpu

LLM reasoning benchmark for evaluating responses to misleading prompts

created 1 year ago
429 stars

Top 70.2% on sourcepulse

Project Summary

This repository provides a curated collection of "trick questions" designed to probe and challenge the reasoning capabilities of Large Language Models (LLMs). It offers variations on classic logic puzzles, riddles, and paradoxes, modified to expose common LLM failure modes such as the Einstellungseffekt (fixation on familiar patterns) and the conjunction fallacy. The goal is to provide a benchmark for evaluating LLM robustness against misleading information and to encourage the development of more reliable reasoning systems.

How It Works

The project presents modified versions of well-known problems (e.g., Trolley Problem, Monty Hall, River Crossing) where subtle changes are introduced to disrupt standard LLM responses. These modifications aim to prevent LLMs from simply recalling pre-trained solutions, forcing them instead to engage in step-by-step logical deduction. The README details specific examples, highlighting how LLMs often fail by applying solutions to the original, unmodified problems or by generating overly complex, irrelevant reasoning chains.
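To illustrate the idea, here is a minimal sketch of how one of these modified prompts could be graded. The prompt wording, the keyword check, and the stubbed `query_llm` function are all illustrative assumptions, not part of the repository, which ships only the prompts themselves:

```python
# Sketch: grading an LLM response to a modified classic puzzle
# (a dead-cat variant of Schrödinger's cat). Illustrative only.
MODIFIED_PROMPT = (
    "A dead cat is placed into a box along with a nuclear isotope, "
    "a vial of poison, and a radiation detector. What is the probability "
    "of the cat being alive when the box is opened one day later?"
)

def grade_response(response: str) -> bool:
    """Pass only if the model notices the cat is already dead (i.e.,
    answers zero) instead of reciting the standard 50/50 answer."""
    text = response.lower()
    return "dead" in text and ("zero" in text or "0%" in text)

def query_llm(prompt: str) -> str:
    # Stub standing in for a real model call via any LLM client.
    return "The cat was already dead when placed in the box, so the probability is zero."

print(grade_response(query_llm(MODIFIED_PROMPT)))  # True for a response that engages with the modification
```

A model that pattern-matches to the unmodified puzzle and answers "50%" would fail this check, which is exactly the failure mode the benchmark targets.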

Quick Start & Requirements

  • Usage: Primarily for prompt engineering and LLM evaluation. No specific installation required; prompts are directly used with LLM interfaces.
  • Requirements: Access to an LLM.
  • Resources: Minimal; requires only text input and LLM processing.
  • Links:
    • Evaluation results: evaluation folder
    • Original problem references: Linked within the README (e.g., Wikipedia).
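Because the prompts are plain text, a batch evaluation can be a simple loop over them with any LLM client. The prompt snippets and the `ask` stub below are illustrative placeholders, not taken from the repository:

```python
# Minimal batch-evaluation loop over plain-text prompts.
# ask() is a stub; substitute any client that maps a prompt string
# to a response string.
prompts = [
    "A farmer with a wolf wants to cross a river in a boat that holds both.",
    "You are on a game show with three doors, all of which are already open.",
]

def ask(prompt: str) -> str:
    # Placeholder for a real model call (e.g., an API client).
    return f"[model response to: {prompt[:30]}...]"

results = {prompt: ask(prompt) for prompt in prompts}
for prompt, response in results.items():
    print(f"PROMPT: {prompt}\nRESPONSE: {response}\n")
```

Responses can then be graded by hand or with per-prompt checks, as the repository's evaluation folder does for its tracked models.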

Highlighted Details

  • Einstellungseffekt: Demonstrates how LLMs can be susceptible to recognizing familiar problem structures and applying incorrect, pre-learned solutions.
  • Interactive Evaluation: Includes results from interactive evaluations to track LLM performance improvements over time.
  • Prompt Variations: Offers a wide range of modified puzzles, including logic, riddles, and probability problems, with clear explanations of the intended LLM failure modes.
  • Community Contributions: Actively encourages contributions of new prompts and improvements, fostering a collaborative approach to LLM evaluation.

Maintenance & Community

  • Activity: Last updated January 2025.
  • Contributions: Features contributions from various users, noted with GitHub handles.
  • Community: Encourages interaction via GitHub Issues and Discussions.

Licensing & Compatibility

  • License: Not explicitly stated in the README.
  • Compatibility: Prompts are plain text and compatible with any LLM interface.

Limitations & Caveats

The repository's license is not specified, which creates uncertainty for commercial use and redistribution. The effectiveness of the prompts can also vary significantly with the specific LLM architecture and its training data, so results may not transfer across models.

Health Check

  • Last commit: 2 days ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 18 stars in the last 90 days
