hallbayes by leochlon

LLM hallucination risk calculator and prompt re-engineering toolkit

Created 2 weeks ago

981 stars

Top 37.7% on SourcePulse

View on GitHub
1 Expert Loves This Project
Project Summary

This toolkit addresses LLM hallucination risk by providing post-hoc calibration and prompt re-engineering for OpenAI models. It lets users quantify hallucination risk using the Expectation-level Decompression Law (EDFL) and make informed ANSWER/REFUSE decisions under target Service Level Agreements (SLAs), with transparent mathematical guarantees expressed in nats (natural-log units of information). The primary audience is engineers and researchers seeking to improve LLM reliability and safety without retraining.

How It Works

The core approach leverages the EDFL principle to bound hallucination risk. It generates "rolling priors" by creating ensembles of content-weakened prompts (skeletons) from the original prompt. The system calculates an information budget ($\bar{\Delta}$, the average information lift of the full prompt over the skeletons) and uses two priors: an average prior ($\bar{q}$) for the EDFL risk bound and a worst-case prior ($q_{\text{lo}}$) for SLA gating. A decision to ANSWER is made only if the information sufficiency ratio (ISR = $\bar{\Delta}$ / B2T, where B2T is the nats required to reach the target SLA from the worst-case prior) meets a threshold; this keeps the gate conservative while the reported risk bound stays realistic. It supports evidence-based (context erasure) and closed-book (semantic masking) prior generation methods.
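
A minimal numeric sketch of this gate is below. It assumes B2T is the binary KL divergence (in nats) between the target reliability $1 - h^*$ and the worst-case prior, and that the EDFL bound is obtained by inverting the same relation against the average prior; margins, clipping, and other details of the repository's actual implementation are omitted.

```python
import math

def binary_kl(p: float, q: float) -> float:
    """KL divergence, in nats, between Bernoulli(p) and Bernoulli(q)."""
    eps = 1e-12
    p = min(max(p, eps), 1 - eps)
    q = min(max(q, eps), 1 - eps)
    return p * math.log(p / q) + (1 - p) * math.log((1 - p) / (1 - q))

def bits_to_trust(q_lo: float, h_star: float) -> float:
    """Nats needed to lift the worst-case prior q_lo to the target
    reliability 1 - h_star (assumed form of B2T)."""
    return binary_kl(1.0 - h_star, q_lo)

def edfl_risk_bound(delta_bar: float, q_bar: float) -> float:
    """Smallest hallucination rate certifiable with budget delta_bar over
    the average prior q_bar, found by bisection on the same KL relation."""
    lo, hi = 0.0, 1.0 - q_bar
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if binary_kl(1.0 - mid, q_bar) > delta_bar:
            lo = mid   # budget too small to certify a risk this low
        else:
            hi = mid
    return hi

def decide(delta_bar: float, q_bar: float, q_lo: float,
           h_star: float = 0.05, isr_threshold: float = 1.0):
    """ANSWER only if the measured budget covers the nats the SLA requires."""
    b2t = bits_to_trust(q_lo, h_star)
    isr = delta_bar / b2t if b2t > 0 else float("inf")
    decision = "ANSWER" if isr >= isr_threshold else "REFUSE"
    return decision, isr, edfl_risk_bound(delta_bar, q_bar)

# 0.9 nats of lift over priors q_bar=0.25, q_lo=0.15, with a 5% SLA:
print(decide(0.9, q_bar=0.25, q_lo=0.15))  # -> ('REFUSE', ~0.56, risk bound ~0.12)
```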

Quick Start & Requirements

  • Primary install: pip install --upgrade openai
  • Prerequisites: an OpenAI API key (export OPENAI_API_KEY=sk-...) and openai>=1.0.0; the toolkit calls the OpenAI Chat Completions API (e.g., gpt-4o, gpt-4o-mini).
  • Links: Example scripts are provided within the repository for usage guidance; a minimal environment check is sketched below.
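
Before running the example scripts, a quick check that the Chat Completions dependency is wired up can look like the following. This is a generic openai>=1.0.0 snippet, not part of the hallbayes API itself.

```python
# Generic environment check: assumes OPENAI_API_KEY is exported.
from openai import OpenAI

client = OpenAI()  # picks up OPENAI_API_KEY from the environment
resp = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Reply with the single word OK."}],
    max_tokens=5,
)
print(resp.choices[0].message.content)  # expected: "OK"
```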

Highlighted Details

  • Quantifies hallucination risk and provides explicit ANSWER/REFUSE decisions with transparent mathematical reasoning in "nats".
  • Supports two distinct modes for generating "rolling priors": evidence-based (erasing/permuting context) and closed-book (semantic masking of entities, numbers, and titles); a rough closed-book sketch follows this list.
  • Enables the generation of formal SLA certificates for auditability and compliance.
  • Documents why particular query types (e.g., arithmetic, factoids) behave differently under the gate, framing conservative abstention as a safety feature rather than a defect.
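
To make the closed-book mode concrete, the sketch below masks quoted titles, numbers, and capitalized spans to form a content-weakened skeleton, then estimates an answer-rate prior by sampling the model. The masking rules and helper names are illustrative assumptions; the repository's own skeleton generation is more careful.

```python
import re
from openai import OpenAI

client = OpenAI()

def mask_closed_book(prompt: str) -> str:
    """Crude semantic masking: hide quoted titles, numbers, and proper nouns."""
    s = re.sub(r'"[^"]+"', '"[TITLE]"', prompt)
    s = re.sub(r"\b\d[\d,.]*\b", "[NUM]", s)
    s = re.sub(r"\b[A-Z][a-z]+(?: [A-Z][a-z]+)*\b", "[ENT]", s)
    return s

def answer_rate(prompt: str, n: int = 5, model: str = "gpt-4o-mini") -> float:
    """Estimate how often the model answers rather than abstaining."""
    answered = 0
    for _ in range(n):
        resp = client.chat.completions.create(
            model=model,
            temperature=1.0,
            messages=[{"role": "user",
                       "content": prompt + "\nIf you are unsure, reply only with IDK."}],
        )
        text = (resp.choices[0].message.content or "").strip()
        answered += 0 if text.upper().startswith("IDK") else 1
    return answered / n

question = "Who won the Nobel Prize in Literature in 2016?"
skeleton = mask_closed_book(question)
print(skeleton)               # e.g. "[ENT] won the [ENT] in [ENT] in [NUM]?"
print(answer_rate(skeleton))  # a rolling-prior estimate for this skeleton
```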

Maintenance & Community

Developed by Hassana Labs (https://hassana.io). The implementation follows the framework described in the paper “Compression Failure in LLMs: Bayesian in Expectation, Not in Realization” (NeurIPS 2024 preprint) and related EDFL/ISR/B2T methodology. No community channels (e.g., Discord, Slack) or explicit roadmap links are provided in the README.

Licensing & Compatibility

Licensed under the MIT License, a permissive license that allows commercial use, modification, and integration into closed-source projects, provided the copyright and license notice are retained.

Limitations & Caveats

The toolkit is strictly limited to OpenAI models and the Chat Completions API. Arithmetic queries may abstain unexpectedly: the skeletons still reveal the task pattern, so the measured information lift ($\bar{\Delta}$) stays low, and careful tuning or alternative event definitions (e.g., Correct/Incorrect) may be needed. Aggressive clipping or prior collapse ($q_{\text{lo}} \to 0$) can likewise push the gate toward over-refusal, necessitating adjustments to parameters like B_clip or the use of prior floors (see the sketch below). The bus factor is unclear, as development is attributed to a single entity.
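
As a rough illustration of those mitigations (the floor value and the B_clip semantics here are assumptions, not the repository's exact knobs): flooring $q_{\text{lo}}$ keeps B2T finite, and the clip ceiling controls how much each skeleton's lift can contribute to $\bar{\Delta}$.

```python
# Illustrative stabilization of the gate's inputs (hypothetical parameter values).
PRIOR_FLOOR = 0.02   # assumed floor: keep q_lo away from 0 so B2T stays finite
B_CLIP = 6.0         # assumed per-skeleton lift ceiling, in nats

def stabilized_inputs(per_skeleton_lifts, q_lo):
    q_lo = max(q_lo, PRIOR_FLOOR)                         # prior floor
    clipped = [min(max(d, 0.0), B_CLIP) for d in per_skeleton_lifts]
    delta_bar = sum(clipped) / len(clipped)               # information budget
    return delta_bar, q_lo

delta_bar, q_lo = stabilized_inputs([0.4, 7.3, 2.1], q_lo=0.0)
print(delta_bar, q_lo)  # lifts clipped at B_CLIP; collapsed prior floored at 0.02
```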

Health Check

  • Last Commit: 1 day ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 3
  • Issues (30d): 5

Star History

992 stars in the last 17 days

Explore Similar Projects

Starred by Chip Huyen (author of "AI Engineering" and "Designing Machine Learning Systems"), Travis Fischer (founder of Agentic), and 1 more.

HaluEval by RUCAIBox

Top 0.8% on SourcePulse
510 stars
Benchmark dataset for LLM hallucination evaluation
Created 2 years ago
Updated 1 year ago