Framework for uncertainty estimation in LLM text generation
LM-Polygraph provides a comprehensive Python framework for evaluating uncertainty estimation (UE) methods in Large Language Models (LLMs) for text generation. It aims to make LLM applications safer by identifying potential hallucinations through confidence scores, targeting researchers and developers working with LLMs.
How It Works
The framework supports "white-box" (full model access), "grey-box" (access to token probabilities via logprobs), and "black-box" (API-based) model interactions. It implements a wide array of state-of-the-art UE techniques, categorized into information-based, meaning diversity, ensembling, and density-based methods. This flexible architecture allows for consistent benchmarking and integration with various LLM architectures and APIs.
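To make two of these categories concrete, here is a minimal sketch in plain Python. It does not use LM-Polygraph's own API. The function names `sequence_uncertainty` and `answer_entropy` are hypothetical, chosen only for illustration. The first function assumes a grey-box setting where per-token log-probabilities are available. The second assumes a black-box setting where only several sampled answers are available, and it uses exact matching after normalisation as a crude stand-in for the semantic grouping that meaning-diversity methods rely on.

```python
import math
from collections import Counter
from typing import Dict, Sequence


def sequence_uncertainty(token_logprobs: Sequence[float]) -> Dict[str, float]:
    """Information-based scores from per-token log-probabilities (grey-box).

    Hypothetical helper for illustration; not part of LM-Polygraph.
    """
    if not token_logprobs:
        raise ValueError("token_logprobs must not be empty")
    total = sum(token_logprobs)
    n = len(token_logprobs)
    return {
        # Negative log-likelihood of the whole sequence: higher = less confident.
        "neg_sequence_logprob": -total,
        # Perplexity: exponential of the mean negative token log-probability.
        "perplexity": math.exp(-total / n),
    }


def answer_entropy(samples: Sequence[str]) -> float:
    """Entropy in nats over distinct sampled answers (black-box).

    Answers are compared after lowercasing and collapsing whitespace.
    This exact-match grouping is an assumption made for simplicity.
    """
    if not samples:
        raise ValueError("samples must not be empty")
    counts = Counter(" ".join(s.lower().split()) for s in samples)
    n = len(samples)
    return -sum((c / n) * math.log(c / n) for c in counts.values())
```

For example, four tokens that each have probability 0.5 give a perplexity of about 2.0, and a set of samples that all normalise to the same answer gives an entropy of zero. In both cases a larger value signals higher uncertainty, which is the kind of score that can be thresholded to flag a possible hallucination.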
Quick Start & Requirements
Install from PyPI: pip install lm-polygraph
Install from source: git clone ... && cd lm-polygraph && pip install . (checking out a stable release tag is recommended)
Environment variables: OPENAI_BASE_URL and OPENAI_API_KEY
Highlighted Details
Maintenance & Community
The project is associated with EMNLP 2023 and has recent arXiv publications, indicating active development. Links to community channels are not explicitly provided in the README.
Licensing & Compatibility
The README does not explicitly state a license. A permissive license such as MIT or Apache would be typical for a project of this kind, but that is an assumption rather than a confirmed fact. Check the license in the repository before relying on it for commercial use or integration into closed-source projects.
Limitations & Caveats
The README notes that code from the main branch may be unstable. Some UE methods require training data, and certain demo applications might require Colab Pro for larger models due to memory constraints.