Dataset and code for refreshing LLMs with search
This repository provides the dataset and code for FreshLLMs, a method for refreshing Large Language Models (LLMs) with search engine augmentation. It is aimed at LLM researchers and developers who want model answers to stay factual and current, and it offers a structured approach to data collection and evaluation.
How It Works
The project centers around the FreshQA dataset, a continuously updated collection of questions and answers designed to evaluate LLM factuality. It also introduces FreshEval, an automatic evaluation metric that leverages few-shot in-context learning with LLMs to assess response quality, aiming to mimic human judgment for factuality.
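The few-shot evaluation idea can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the `Demo` type, the `build_eval_prompt` and `parse_verdict` helpers, the instruction wording, and the TRUE/FALSE verdict format are hypothetical and are not taken from the FreshEval code. The sketch only assembles a prompt from labeled demonstrations and parses a verdict string; sending the prompt to an LLM is left out.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Demo:
    """One labeled demonstration (hypothetical structure)."""
    question: str
    correct_answers: List[str]
    response: str
    verdict: str  # "TRUE" or "FALSE" (assumed label format)


def _format_example(question: str, answers: List[str], response: str, verdict: str) -> str:
    # An empty verdict leaves the last line open for the model to complete.
    text = (
        f"question: {question}\n"
        f"correct answer(s): {' | '.join(answers)}\n"
        f"response: {response}\n"
        f"verdict: {verdict}"
    )
    return text.rstrip()


def build_eval_prompt(demos: List[Demo], question: str,
                      correct_answers: List[str], response: str) -> str:
    """Concatenate an instruction, the demonstrations, and the item to judge."""
    instruction = (
        "Judge whether each response is factually correct given the "
        "reference answers. Reply with TRUE or FALSE."
    )
    blocks = [instruction]
    for d in demos:
        blocks.append(_format_example(d.question, d.correct_answers, d.response, d.verdict))
    blocks.append(_format_example(question, correct_answers, response, ""))
    return "\n\n".join(blocks)


def parse_verdict(completion: str) -> Optional[bool]:
    """Map a model completion to True/False, or None if it is unparseable."""
    token = completion.strip().upper()
    if token.startswith("TRUE"):
        return True
    if token.startswith("FALSE"):
        return False
    return None
```

In practice the returned prompt would be sent to the chosen evaluator model and the completion passed to `parse_verdict`; unparseable completions are returned as `None` so they can be counted separately rather than silently scored.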
Quick Start & Requirements
Highlighted Details
Maintenance & Community
The README credits several contributors, both for the original dataset and for its ongoing updates. SerpApi is a sponsor, providing search credits for FreshPrompt users.
Licensing & Compatibility
The repository does not explicitly state a license. The provided citation is for an arXiv paper. Commercial use implications are not detailed.
Limitations & Caveats
The FreshEval metric's accuracy depends on the chosen LLM and on access to its API. The README recommends gpt-4-1106-preview over gpt-4-0125-preview for FreshEval because it showed slightly better agreement with human annotations in the authors' evaluation.
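One simple way to compare evaluator models against human annotations is raw agreement: the fraction of items on which the automatic verdict matches the human label. The sketch below is an assumption about how such a comparison could be computed; the `agreement_rate` helper is hypothetical, and the source does not state which agreement measure the authors used.

```python
from typing import Sequence


def agreement_rate(metric_verdicts: Sequence[bool], human_labels: Sequence[bool]) -> float:
    """Fraction of items where the automatic verdict equals the human label."""
    if len(metric_verdicts) != len(human_labels):
        raise ValueError("verdict and label lists must have the same length")
    if not human_labels:
        raise ValueError("at least one labeled item is required")
    matches = sum(m == h for m, h in zip(metric_verdicts, human_labels))
    return matches / len(human_labels)
```

Running this once per candidate evaluator model on the same human-labeled items gives directly comparable numbers.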