LLM-Factuality-Survey by wangcunxiang

Survey paper on factuality in large language models

Created 1 year ago
340 stars

Top 81.1% on SourcePulse

Project Summary

This repository hosts a survey of factuality in Large Language Models (LLMs), covering knowledge representation, retrieval augmentation, and domain-specific challenges. It is aimed at NLP and AI researchers and practitioners who need a structured overview of LLM factuality issues, existing solutions, and evaluation benchmarks.

How It Works

The survey groups the causes of factuality errors into model-level causes (e.g., knowledge deficit, reasoning errors) and retrieval-level causes (e.g., distraction, misinterpretation). It then reviews enhancement methods, including continual pre-training, supervised fine-tuning, and model editing, many of which draw on external knowledge sources. The paper also reviews the datasets and evaluation metrics used to assess LLM factuality across different domains.
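The two-level grouping described above can be written down as a small lookup structure. This is only an illustrative sketch: the names `CauseLevel`, `FACTUALITY_CAUSES`, `ENHANCEMENT_METHODS`, and `level_of` are hypothetical and do not come from the repository or the paper, and the entries are limited to the examples named in this summary, not the survey's full taxonomy.

```python
from enum import Enum
from typing import Optional


class CauseLevel(Enum):
    """The two levels at which the summary says factuality errors originate."""
    MODEL = "model-level"
    RETRIEVAL = "retrieval-level"


# Only the example causes named in the summary; the survey itself covers more.
FACTUALITY_CAUSES = {
    CauseLevel.MODEL: ["knowledge deficit", "reasoning errors"],
    CauseLevel.RETRIEVAL: ["distraction", "misinterpretation"],
}

# Enhancement methods named in the summary (not an exhaustive list).
ENHANCEMENT_METHODS = [
    "continual pre-training",
    "supervised fine-tuning",
    "model editing",
]


def level_of(cause: str) -> Optional[CauseLevel]:
    """Return the level a named cause belongs to, or None if it is not listed."""
    for level, causes in FACTUALITY_CAUSES.items():
        if cause in causes:
            return level
    return None
```

For example, `level_of("distraction")` returns `CauseLevel.RETRIEVAL`, while a cause that is not listed here returns `None`.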

Quick Start & Requirements

This repository is a collection of survey material, so there is nothing to install or run. For the detailed content, see the arXiv paper linked from the repository.

Highlighted Details

  • Comprehensive taxonomy of LLM factuality errors and their causes.
  • Extensive catalog of LLM factuality evaluation benchmarks and metrics.
  • Detailed review of enhancement methods for improving LLM factuality.
  • Analysis of domain-specific LLMs and their factuality challenges (e.g., medicine, law, finance).

Maintenance & Community

The repository accompanies the survey paper "Survey on Factuality in Large Language Models: Knowledge, Retrieval and Domain-Specificity." Contributions that improve the survey content are welcome via pull requests or issues.

Licensing & Compatibility

The repository itself does not specify a license. The survey paper is available on arXiv.

Limitations & Caveats

As a survey repository, it primarily aggregates and organizes information from other research papers. Later revisions of the arXiv paper may not be reflected here.

Health Check

  • Last commit: 1 year ago
  • Responsiveness: Inactive
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 0 stars in the last 30 days
