fake-news-detector by CaptainYifei

Automated fake news detection system using AI and evidence search

Created 7 months ago
419 stars

Top 70.0% on SourcePulse

Project Summary

This project provides an automated fake news detection system built on AI and evidence search. It is aimed at users who need to verify the accuracy of information: it extracts claims from the text, searches online for evidence, and performs semantic analysis using large language and embedding models. A Streamlit web interface shows the verification process step by step in real time, helping users judge news credibility quickly.

How It Works

The system employs a multi-stage pipeline. It first uses a Large Language Model (LLM), such as Qwen2.5, to extract verifiable claims from input news text. Subsequently, it queries the DuckDuckGo search engine to gather relevant evidence. The BGE-M3 embedding model then calculates semantic similarity between the extracted claims and the retrieved evidence, identifying the most pertinent information. Finally, based on this evidence, the system provides a judgment on the news's veracity, detailing the reasoning process.

Quick Start & Requirements

  • Primary Install/Run: Clone the repository, install dependencies via pip install -r requirements.txt, and run the application using streamlit run app.py.
  • Prerequisites: Python 3.12 is required. Users must have access to a compatible LLM (e.g., local Qwen2.5-14B or an OpenAI-compatible API) and the BGE-M3 embedding model (which can be locally deployed or accessed via API). Model paths may need configuration in fact_checker.py.
  • Links: GitHub: https://github.com/CaptainYifei/fake-news-detector

Highlighted Details

  • Automated extraction of verifiable claims from news articles.
  • Real-time evidence gathering via DuckDuckGo search.
  • Advanced semantic relevance ranking using the BGE-M3 embedding model.
  • Streaming interface provides a transparent, step-by-step view of the fact-checking process.
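The semantic relevance ranking mentioned above can be illustrated with a short sketch. It assumes cosine similarity as the measure, which the summary does not confirm, and uses hand-written toy vectors in place of real BGE-M3 embeddings; `cosine_similarity` and `rank_evidence` are hypothetical names.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_evidence(
    claim_vec: Sequence[float],
    evidence: Sequence[Tuple[str, Sequence[float]]],
    top_k: int = 3,
) -> List[Tuple[str, float]]:
    """Score each (text, vector) pair against the claim and return the top_k."""
    scored = [(text, cosine_similarity(claim_vec, vec)) for text, vec in evidence]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


if __name__ == "__main__":
    # Toy 2-D vectors; a real system would obtain these from an embedding model.
    claim = [1.0, 0.0]
    candidates = [("a", [1.0, 0.0]), ("b", [0.0, 1.0]), ("c", [1.0, 1.0])]
    print(rank_evidence(claim, candidates, top_k=2))
```

Only the highest-scoring snippets are kept, so the final judgment step sees a short list of evidence rather than every search result.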

Maintenance & Community

The project welcomes contributions via standard GitHub pull requests. Issues and code are tracked in the GitHub repository.

Licensing & Compatibility

The project is released under the MIT License, which generally permits broad use, modification, and distribution, including for commercial purposes, with minimal restrictions.

Limitations & Caveats

The system relies on the availability and quality of external search results and on the accuracy of the configured LLM and embedding models. Local deployment of the Qwen2.5 and BGE-M3 models may require significant computational resources and specific hardware configurations. The README does not detail performance benchmarks or any hardware requirements; the only stated requirement is the Python version.

Health Check

  • Last commit: 1 month ago
  • Responsiveness: Inactive
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 66 stars in the last 30 days
