Survey paper resource for LLM evaluation
This repository serves as a comprehensive, curated collection of academic papers and resources focused on the evaluation of Large Language Models (LLMs). It aims to provide researchers and practitioners with a structured overview of the evolving landscape of LLM assessment across various domains and tasks.
How It Works
The project organizes papers and resources based on the categories outlined in the survey paper "A Survey on Evaluation of Large Language Models." This structured approach allows users to navigate and discover relevant research concerning what aspects of LLMs to evaluate (e.g., natural language understanding, reasoning, robustness, ethics) and where to evaluate them using specific benchmarks.
Quick Start & Requirements
This repository is a collection of research papers and resources; there is nothing to install or run. Users can browse the listed papers and follow their associated links (arXiv, GitHub, etc.) for further details.
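For readers who prefer to work with the list programmatically, below is a minimal sketch of how one might extract the paper links from a local clone. It assumes the curated list lives in a markdown `README.md` with `#`-style section headings and `[title](url)` links; the file name and structure are assumptions, not something specified by this repository.

```python
# Sketch: group the paper links in a markdown file by the section heading
# they appear under. Assumes a local clone containing README.md.
import re
from collections import defaultdict
from pathlib import Path

HEADING = re.compile(r"^(#{1,6})\s+(.*)")
LINK = re.compile(r"\[([^\]]+)\]\((https?://[^)]+)\)")

def papers_by_section(readme_path="README.md"):
    sections = defaultdict(list)
    current = "Unsectioned"
    for line in Path(readme_path).read_text(encoding="utf-8").splitlines():
        m = HEADING.match(line)
        if m:
            # Start collecting links under the new heading.
            current = m.group(2).strip()
            continue
        for title, url in LINK.findall(line):
            sections[current].append((title, url))
    return sections

if __name__ == "__main__":
    for section, papers in papers_by_section().items():
        print(f"{section}: {len(papers)} links")
```

Running the script from the repository root prints a per-section count of linked papers, which gives a quick sense of how the survey's categories are covered.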
Maintenance & Community
The project is maintained by the authors of the survey paper, with acknowledgments for contributions from Tahmid Rahman, Hao Zhao, Chenhui Zhang, Damien Sileo, Peiyi Wang, Zengzhi Wang, Kenneth Leung, Aml-Hassan-Abd-El-hamid, and Taicheng Guo.
Licensing & Compatibility
The repository itself does not specify a license, but it curates links to academic papers, which are typically governed by their respective publication licenses or terms of use.
Limitations & Caveats
As a survey and resource collection, this repository does not provide executable code or evaluation tools itself. Users must refer to the linked papers and projects for implementation details and usage.