llm-hallucination-survey by HillZhang1999

Survey of hallucination in LLMs

Created 2 years ago
1,051 stars

Top 35.8% on SourcePulse

View on GitHub
Project Summary

This repository serves as a comprehensive reading list and survey of research papers focused on hallucinations in Large Language Models (LLMs). It aims to provide researchers and practitioners with a structured overview of the problem, its various types, evaluation methods, sources, and mitigation strategies.

How It Works

The project categorizes LLM hallucinations into three main types: input-conflicting, context-conflicting, and fact-conflicting. It then meticulously lists and links to relevant research papers for each category, covering evaluation benchmarks, potential sources of hallucination, and diverse mitigation techniques applied during pretraining, fine-tuning, RLHF, and inference.
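
To make the taxonomy concrete, here is a minimal illustration of the three types. The examples and the dictionary structure below are invented for this summary, not taken from the repository or the survey paper.

    # Illustrative (hypothetical) examples of the survey's three hallucination types.
    hallucination_types = {
        "input-conflicting": {
            "what": "The output contradicts the user's input.",
            "input": "Summarize: 'Alice moved to Paris in 2019.'",
            "output": "Alice moved to London in 2019.",  # contradicts the source text
        },
        "context-conflicting": {
            "what": "The output contradicts the model's own earlier output.",
            "input": "Earlier turn: 'The protagonist is named Bob.'",
            "output": "Then David, the protagonist, left town.",  # self-contradiction
        },
        "fact-conflicting": {
            "what": "The output contradicts established world knowledge.",
            "input": "Who wrote 'Hamlet'?",
            "output": "Charles Dickens wrote 'Hamlet'.",  # factually wrong
        },
    }

    for name, case in hallucination_types.items():
        print(f"{name}: {case['what']}")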

Quick Start & Requirements

This repository is a curated reading list; there is nothing to install or run. The only requirement is internet access to follow the linked papers.

Highlighted Details

  • Features a survey paper: "Siren’s Song in the AI Ocean: A Survey on Hallucination in Large Language Models."
  • Categorizes hallucinations into input-conflicting, context-conflicting, and fact-conflicting types.
  • Provides extensive lists of papers for evaluation, source analysis, and mitigation techniques.
  • Covers a wide range of mitigation strategies, including data curation, fine-tuning, RLHF, inference-time decoding, and external knowledge integration (a sketch of the last of these follows this list).
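
One of the strategies above can be sketched in a few lines: a toy form of external knowledge integration, where the model is prompted to answer only from retrieved evidence. Everything here (the lexical retriever, the prompt wording, the placeholder model call) is a hypothetical illustration, not code from the repository or the surveyed papers.

    # Toy retrieval-augmented prompting to curb fact-conflicting hallucinations.
    def _words(text: str) -> set[str]:
        # Lowercase and split on non-alphanumeric characters.
        return set("".join(c if c.isalnum() else " " for c in text.lower()).split())

    def simple_retrieve(question: str, corpus: list[str], top_k: int = 2) -> list[str]:
        # Rank passages by word overlap with the question (stand-in for a real retriever).
        q = _words(question)
        return sorted(corpus, key=lambda p: len(q & _words(p)), reverse=True)[:top_k]

    def answer_with_retrieval(question: str, corpus: list[str], llm_generate) -> str:
        # Ground the answer in retrieved evidence rather than the model's parametric memory.
        context = "\n".join(simple_retrieve(question, corpus))
        prompt = (
            "Answer using ONLY the context below; if it is insufficient, say so.\n\n"
            f"Context:\n{context}\n\nQuestion: {question}"
        )
        return llm_generate(prompt)

    # Usage with a placeholder in place of a real LLM call:
    corpus = [
        "Hamlet was written by William Shakespeare around 1600.",
        "Paris is the capital of France.",
    ]
    print(answer_with_retrieval("Who wrote Hamlet?", corpus, lambda p: p))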

Maintenance & Community

The project is maintained by HillZhang1999. Contact is available via email for suggestions or contributions.

Licensing & Compatibility

The repository itself does not specify a license, but it links to numerous research papers, each with its own licensing and usage terms.

Limitations & Caveats

This is a curated list of research papers and does not provide code or tools for direct experimentation. The rapidly evolving nature of LLM research means new papers and findings may not be immediately reflected.

Health Check

  • Last Commit: 2 weeks ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 10 stars in the last 30 days

Explore Similar Projects

Starred by Chip Huyen (author of "AI Engineering" and "Designing Machine Learning Systems"), Travis Fischer (founder of Agentic), and 1 more.

HaluEval by RUCAIBox

Benchmark dataset for LLM hallucination evaluation

Created 2 years ago · Updated 1 year ago · 516 stars