LLM reasoning benchmark for evaluating responses to misleading prompts
Top 70.2% on sourcepulse
This repository provides a curated collection of "trick questions" designed to probe and challenge the reasoning capabilities of Large Language Models (LLMs). It offers variations on classic logic puzzles, riddles, and paradoxes, modified to expose common LLM failure modes such as the Einstellung effect (fixation on familiar solution patterns) and the conjunction fallacy. The goal is to provide a benchmark for evaluating LLM robustness against misleading information and to encourage the development of more reliable reasoning systems.
How It Works
The project presents modified versions of well-known problems (e.g., Trolley Problem, Monty Hall, River Crossing) where subtle changes are introduced to disrupt standard LLM responses. These modifications aim to prevent LLMs from simply recalling pre-trained solutions, forcing them instead to engage in step-by-step logical deduction. The README details specific examples, highlighting how LLMs often fail by applying solutions to the original, unmodified problems or by generating overly complex, irrelevant reasoning chains.
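The sketch below illustrates one way such prompts could be used as an automated benchmark; it is not part of the repository. It assumes a hypothetical query_llm(prompt) helper for calling a model, and the transparent-doors Monty Hall variant and its failure keywords are illustrative placeholders in the spirit of the modifications the README describes.

```python
# Minimal evaluation sketch (not part of the repository).
# Assumes a hypothetical query_llm(prompt) callable that returns the model's reply as a string.

# Each entry pairs a modified "trick" prompt with keywords that suggest the model
# has fallen back on the canonical answer to the *original*, unmodified puzzle.
TRICK_PROMPTS = [
    {
        "name": "monty_hall_transparent_doors",
        "prompt": (
            "You are on a game show with three transparent doors. You can see "
            "the car behind door 2. The host opens door 3, revealing a goat, "
            "and offers you the chance to switch from door 1. Should you switch?"
        ),
        # Reciting the standard "always switch, 2/3 probability" answer signals
        # pattern recall rather than reasoning about the transparent doors.
        "failure_keywords": ["2/3", "always switch"],
    },
]


def looks_like_rote_recall(response: str, failure_keywords: list[str]) -> bool:
    """Return True if the response echoes the stock answer to the original puzzle."""
    lowered = response.lower()
    return any(keyword.lower() in lowered for keyword in failure_keywords)


def run_benchmark(query_llm) -> None:
    """Run every trick prompt through the model and report apparent recall failures."""
    for case in TRICK_PROMPTS:
        response = query_llm(case["prompt"])
        failed = looks_like_rote_recall(response, case["failure_keywords"])
        verdict = "FAIL (rote recall)" if failed else "PASS (no obvious recall)"
        print(f'{case["name"]}: {verdict}')
```

Keyword matching is a deliberately crude scoring heuristic; a real harness would likely need per-prompt grading criteria or human review, since a correct answer to a modified puzzle can legitimately mention the original solution while explaining why it no longer applies.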
Quick Start & Requirements
Highlighted Details
Maintenance & Community
Licensing & Compatibility
Limitations & Caveats
The repository does not specify a license, which by default restricts commercial use and redistribution. The effectiveness of the prompts also varies significantly across LLM architectures and training data, so results may not transfer between models.