ruozhiba by Leymore

Dataset for LLM entertainment using Ruozhiba posts

Created 2 years ago
745 stars

Top 46.6% on SourcePulse

Project Summary

This repository provides a dataset of posts from the Baidu "Ruozhiba" (Weak-minded Bar) forum, intended to inspire creative and entertaining uses of Large Language Models (LLMs) like ChatGPT. It is primarily for researchers and developers exploring novel LLM applications.

How It Works

The project curates and organizes posts from the Ruozhiba forum, categorizing them by quality and type (full posts or titles). This structured data serves as a unique corpus for training or fine-tuning LLMs, enabling them to generate humorous, nonsensical, or creatively "weak-minded" text, thereby exploring the boundaries of LLM creativity and safety.
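As a minimal sketch of how such a corpus might be prepared for fine-tuning, the snippet below turns raw post titles into chat-style JSONL records. The sample titles and the record schema are illustrative assumptions, not the repository's actual files or format:

```python
import json

# Hypothetical sample titles standing in for lines of the dataset's
# title files (assumed to be plain text, one post title per line).
sample_titles = [
    "既然监狱里全是罪犯，为什么不去监狱里抓人？",
    "生鱼片是死鱼片",
    "",              # blank lines are skipped
    "生鱼片是死鱼片",  # duplicates are dropped
]

def to_finetune_records(titles):
    """Deduplicate titles and wrap each one as a chat-style record."""
    seen = set()
    records = []
    for title in titles:
        title = title.strip()
        if not title or title in seen:
            continue
        seen.add(title)
        records.append({
            "messages": [
                {"role": "user", "content": title},
                # Left empty here; a human or a stronger model would
                # fill in the response during data construction.
                {"role": "assistant", "content": ""},
            ]
        })
    return records

records = to_finetune_records(sample_titles)
print(len(records))  # 2 unique, non-empty titles
print(json.dumps(records[0], ensure_ascii=False))
```

The chat-message schema mirrors a common fine-tuning JSONL convention; any downstream trainer would need the records adapted to its own expected format.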

Highlighted Details

  • Dataset includes 1.3k annual best posts (2018-2021), 2.6k recommended titles (up to 2023.04.30), and 81.7k general titles (up to 2023.04.30).
  • A separate collection of 2.4k question-type posts is available via a Tencent Docs link.
  • The data is intended to spark ideas for entertaining LLM usage.

Maintenance & Community

The project acknowledges the administrators and members of the Ruozhiba forum for their content contributions. No specific community channels or active maintenance indicators are provided.

Licensing & Compatibility

The repository does not specify a license. The data is sourced from a public forum, but its use for commercial purposes or integration into closed-source projects may require further investigation into the forum's terms of service and copyright.

Limitations & Caveats

The dataset is specific to the "Ruozhiba" forum's unique content style and may not generalize well to other domains. The lack of a specified license poses potential legal and compatibility issues for downstream use.

Health Check

  • Last Commit: 2 years ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 1 star in the last 30 days
