Curated list of human preference datasets for LLM training
This repository curates human preference datasets crucial for fine-tuning Large Language Models (LLMs), particularly for Reinforcement Learning from Human Feedback (RLHF) and evaluation. It serves researchers and developers aiming to align LLM behavior with human values and preferences, offering a centralized resource for high-quality, human-annotated data.
How It Works
The list compiles datasets derived from various sources, including direct human annotations, comparisons of model-generated outputs, and crowd-sourced conversational data. These datasets typically contain prompts, multiple model responses, and human-assigned preference scores or quality ratings, enabling the training of reward models and the evaluation of LLM alignment.
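As a concrete illustration of how such preference pairs are consumed, reward models are commonly trained with a pairwise (Bradley-Terry) objective over chosen/rejected responses. The following is a minimal sketch, assuming PyTorch and scalar reward scores already produced by some reward model (the scores here are dummy values, not from any dataset in this list):

```python
# Minimal sketch of the pairwise (Bradley-Terry) loss that human
# preference pairs make possible. Assumes each example provides a scalar
# reward score for the human-preferred ("chosen") and dispreferred
# ("rejected") response.
import torch
import torch.nn.functional as F

def preference_loss(score_chosen: torch.Tensor,
                    score_rejected: torch.Tensor) -> torch.Tensor:
    # loss = -log(sigmoid(r_chosen - r_rejected)): pushes the reward model
    # to score the human-preferred response above the rejected one.
    return -F.logsigmoid(score_chosen - score_rejected).mean()

# Dummy scores for a batch of three preference pairs:
loss = preference_loss(torch.tensor([1.2, 0.3, 0.8]),
                       torch.tensor([0.4, 0.5, -0.1]))
print(loss.item())
```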
Quick Start & Requirements
Most datasets listed here can be loaded with the Hugging Face `datasets` library, as in the sketch below.
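A minimal loading sketch, assuming the `datasets` package is installed (`pip install datasets`) and using Anthropic/hh-rlhf as an example of the pairwise preference sets typically found on lists like this (requires network access on first download):

```python
# Minimal sketch: load a pairwise human-preference dataset from the
# Hugging Face Hub with the `datasets` library.
from datasets import load_dataset

# Anthropic/hh-rlhf stores each record as a "chosen" and a "rejected"
# conversation for the same prompt.
ds = load_dataset("Anthropic/hh-rlhf", split="train")

example = ds[0]
print(example["chosen"][:300])    # human-preferred response (truncated)
print(example["rejected"][:300])  # dispreferred response (truncated)
```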
Highlighted Details
Maintenance & Community
The repository was last updated about a year ago and is currently marked inactive.
Licensing & Compatibility
Limitations & Caveats
API access to ShareGPT.com data is currently disabled due to excessive traffic. Some datasets may have specific access requirements or usage restrictions, detailed in their respective licenses.