Factuality detection tool for generative AI
FacTool is a framework for detecting factual errors in text generated by large language models across four domains: knowledge-based QA, code generation, mathematical reasoning, and scientific literature review. It assists researchers and developers in evaluating and improving the factual accuracy of LLM outputs.
How It Works
FacTool employs a tool-augmented approach, leveraging external tools and LLM-based reasoning to verify claims. For knowledge-based QA, it uses search engines (Serper) and web scrapers to find evidence. For code generation, it checks for execution errors. For math, it verifies calculations. For scientific literature, it validates citations against actual publications. The framework breaks down responses into claims, generates queries for verification, retrieves evidence, and assesses factuality at both claim and response levels.
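The four stages described above (claim extraction, query generation, evidence retrieval, and verification) can be sketched as a small pipeline. This is a minimal illustration, not FacTool's actual API: `check_response`, `ClaimVerdict`, and the `toy_*` helpers are hypothetical names, the helpers stand in for the LLM and search calls, and the rule that a response is factual only when every claim is factual is an assumption made for this sketch.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class ClaimVerdict:
    """Claim-level result: the claim, the queries issued, the evidence found, the verdict."""
    claim: str
    queries: List[str]
    evidence: List[str]
    factual: bool


def check_response(
    response: str,
    extract_claims: Callable[[str], List[str]],
    make_queries: Callable[[str], List[str]],
    retrieve: Callable[[str], List[str]],
    verify: Callable[[str, List[str]], bool],
) -> Tuple[List[ClaimVerdict], bool]:
    """Run the four stages and return (claim-level verdicts, response-level verdict)."""
    verdicts = []
    for claim in extract_claims(response):
        queries = make_queries(claim)
        # Gather evidence snippets from every query issued for this claim.
        evidence = [snippet for q in queries for snippet in retrieve(q)]
        verdicts.append(ClaimVerdict(claim, queries, evidence, verify(claim, evidence)))
    # Assumption: the response is factual only if all of its claims are.
    # Note that a response with no extracted claims therefore counts as factual.
    return verdicts, all(v.factual for v in verdicts)


# Toy stand-ins (hypothetical) for the LLM and search components.
KNOWN_FACTS = {"Paris is the capital of France"}


def toy_extract(response: str) -> List[str]:
    # Treat each sentence as one claim.
    return [s.strip() for s in response.split(".") if s.strip()]


def toy_queries(claim: str) -> List[str]:
    # Use the claim text itself as the only query.
    return [claim]


def toy_retrieve(query: str) -> List[str]:
    # Exact-match lookup in a tiny in-memory "knowledge base".
    return [fact for fact in KNOWN_FACTS if fact == query]


def toy_verify(claim: str, evidence: List[str]) -> bool:
    # A claim is supported only if the retrieved evidence contains it verbatim.
    return claim in evidence


verdicts, response_ok = check_response(
    "Paris is the capital of France. Berlin is the capital of Spain.",
    toy_extract, toy_queries, toy_retrieve, toy_verify,
)
for v in verdicts:
    print(v.claim, "->", v.factual)
print("response-level factuality:", response_ok)
```

In this toy run the first claim is supported and the second is not, so the response as a whole is marked non-factual. Passing the stages in as callables keeps the sketch independent of any particular LLM or search provider.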
Quick Start & Requirements
pip install factool
Maintenance & Community
The project is associated with GAIR-NLP. Its contributing authors are listed in the project's primary citation.
Licensing & Compatibility
The repository does not explicitly state a license in the README. Compatibility for commercial use or closed-source linking is not specified.
Limitations & Caveats
The framework depends on external API keys (OpenAI, Serper, Scraper), and calls to these services incur costs. Assessment accuracy depends on the quality of the underlying LLM and the effectiveness of the verification tools. In the provided example, the scientific literature review case scored 0% response-level factuality. This may point to limits in citation verification, although a single example cannot separate tool error from a genuinely non-factual response.
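Because every run depends on these keys, a pre-flight check can catch a missing credential before any paid call is made. The sketch below assumes the keys are supplied through environment variables named `OPENAI_API_KEY`, `SERPER_API_KEY`, and `SCRAPER_API_KEY`; these names are an assumption here and should be confirmed against the FacTool README. `missing_keys` is a hypothetical helper, not part of FacTool.

```python
import os
from typing import List, Mapping

# Assumed variable names; confirm them against the FacTool README before relying on this.
REQUIRED_KEYS = ("OPENAI_API_KEY", "SERPER_API_KEY", "SCRAPER_API_KEY")


def missing_keys(env: Mapping[str, str] = os.environ) -> List[str]:
    """Return the required variable names that are unset or blank in `env`."""
    return [name for name in REQUIRED_KEYS if not env.get(name, "").strip()]


if __name__ == "__main__":
    missing = missing_keys()
    if missing:
        print("Missing API keys:", ", ".join(missing))
    else:
        print("All required API keys are set.")
```

Taking the environment as a parameter keeps the helper easy to test with a plain dictionary instead of the real process environment.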