DoLa by voidism

Decoding strategy research paper for improving factuality in LLMs

Created 2 years ago
513 stars

Top 60.9% on SourcePulse

Project Summary

DoLa is the official implementation of a decoding strategy that improves the factuality of large language models (LLMs) without additional fine-tuning or external knowledge retrieval. It targets researchers and practitioners who want to reduce hallucinations in LLM outputs, and it improves truthfulness across several benchmarks.

How It Works

DoLa builds on the observation that factual knowledge in LLMs tends to be localized to particular transformer layers. At each decoding step, it projects both a later (mature) layer and a dynamically selected earlier (premature) layer to the vocabulary space, then contrasts the resulting log-probabilities. This contrastive approach surfaces factual knowledge more reliably, yielding more truthful generations.
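As a rough illustration (not the repository's actual code), the idea can be sketched in plain Python: the premature layer is the candidate whose vocabulary distribution diverges most (by Jensen-Shannon divergence) from the mature layer's, next-token scores are the log-ratio of the two distributions, and tokens the mature layer already deems implausible are vetoed. Function names and the `alpha` threshold here are illustrative assumptions.

```python
import math

def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]

def kl(p, q):
    # Kullback-Leibler divergence with a small epsilon for stability.
    return sum(pi * math.log((pi + 1e-12) / (qi + 1e-12)) for pi, qi in zip(p, q))

def jsd(p, q):
    # Jensen-Shannon divergence between two distributions.
    m = [0.5 * (pi + qi) for pi, qi in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

def dola_scores(mature_logits, premature_logits_list, alpha=0.1):
    """Illustrative sketch of DoLa-style contrastive scoring.

    mature_logits: final-layer logits over the vocabulary.
    premature_logits_list: logits from the candidate early-exit layers.
    alpha: plausibility threshold relative to the top mature-layer token
           (a hypothetical parameter for this sketch).
    """
    p = softmax(mature_logits)
    # Dynamically pick the premature layer whose vocabulary
    # distribution diverges most from the mature layer's.
    q = max((softmax(l) for l in premature_logits_list), key=lambda d: jsd(p, d))
    # Contrast log-probabilities; veto tokens the mature layer
    # already considers implausible.
    cutoff = alpha * max(p)
    return [
        math.log(pi + 1e-12) - math.log(qi + 1e-12) if pi >= cutoff else float("-inf")
        for pi, qi in zip(p, q)
    ]
```

The next token would then be the argmax over these contrastive scores; the real implementation operates on batched tensors inside the repository's patched transformers fork.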

Quick Start & Requirements

  • Install: pip install -e transformers-4.28.1 datasets accelerate
  • Dependencies: Python, transformers, datasets, accelerate. OpenAI API key required for specific evaluations.
  • Hardware: Supports LLaMA-v1 models (7B, 13B, 30B, 65B), with GPU requirements scaling with model size (e.g., 1 GPU for 7B, 8 GPUs for 65B).
  • Documentation: Paper available at https://arxiv.org/abs/2309.03883.

Highlighted Details

  • Improves TruthfulQA performance by 12-17 absolute percentage points for LLaMA models.
  • Supports evaluation on FACTOR, TruthfulQA (multiple choice and open-ended), GSM8K, StrategyQA, and GPT-4 evaluation benchmarks.
  • The decoding strategy is controlled via the --early-exit-layers argument, which lists the candidate layer indices used for contrastive decoding.

Maintenance & Community

The project is associated with authors from MIT and Microsoft. Links to related repositories like FastChat and ContrastiveDecoding are provided.

Licensing & Compatibility

The repository's license is not explicitly stated in the README. Compatibility for commercial use or closed-source linking would require clarification of the licensing terms.

Limitations & Caveats

Currently supports only LLaMA-v1 models. The README notes that certain evaluations require fine-tuning GPT-3 models via the OpenAI API, which incurs costs and requires API access.

Health Check

  • Last Commit: 8 months ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 10 stars in the last 30 days

Starred by Shizhe Diao (Author of LMFlow; Research Scientist at NVIDIA), Yineng Zhang (Inference Lead at SGLang; Research Scientist at Together AI), and 8 more.
