Prompt-based framework for topic modeling research
TopicGPT provides a prompt-based framework for topic modeling, enabling users to generate hierarchical topics, refine them, and assign them to documents using large language models. It is designed for researchers and practitioners in natural language processing and data analysis who need a flexible and powerful approach to uncovering thematic structures in text data.
How It Works
TopicGPT leverages large language models (LLMs) through a series of prompting strategies to perform topic modeling. It first generates high-level topics, then drills down into more specific, low-level topics within each high-level category. The framework includes functions to refine topics by merging similar ones and removing irrelevant ones, and to assign topics to documents with supporting quotes. This approach allows for dynamic topic generation and adaptation without requiring pre-defined vocabularies or extensive parameter tuning, offering a more intuitive and potentially more nuanced understanding of text content.
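The generate, refine, and assign stages described above can be sketched as a toy control flow. This is only an illustration under stated assumptions: the function names (`propose_topics`, `label_documents`, `prune_topics`) and the keyword-based `keyword_llm` stand-in are hypothetical and are not part of the topicgpt_python API, the prompts are placeholders, and the second (low-level) topic pass and the merging of similar topics are omitted for brevity.

```python
from collections import Counter
from typing import Callable, List, Optional

# Hypothetical stand-in type for an LLM call: prompt string in, reply string out.
LLM = Callable[[str], str]


def keyword_llm(prompt: str) -> str:
    """Toy replacement for a real model: picks a label from keywords in the document."""
    doc = prompt.split("Document:", 1)[1].lower()
    if "goal" in doc or "match" in doc:
        return "Sports"
    if "election" in doc or "senate" in doc:
        return "Politics"
    return "Other"


def propose_topics(docs: List[str], llm: LLM) -> List[str]:
    """Generation: show the model the topics found so far and collect any new label."""
    topics: List[str] = []
    for doc in docs:
        label = llm(f"Known topics: {topics}\nDocument: {doc}\nTopic:").strip()
        if label and label not in topics:
            topics.append(label)
    return topics


def label_documents(docs: List[str], topics: List[str], llm: LLM) -> List[Optional[str]]:
    """Assignment: one label per document, or None if the reply is not a known topic."""
    labels: List[Optional[str]] = []
    for doc in docs:
        label = llm(f"Choose one of: {topics}\nDocument: {doc}\nTopic:").strip()
        labels.append(label if label in topics else None)
    return labels


def prune_topics(topics: List[str], labels: List[Optional[str]], min_count: int = 2) -> List[str]:
    """Refinement (removal only): drop topics assigned to fewer than min_count documents."""
    counts = Counter(label for label in labels if label is not None)
    return [topic for topic in topics if counts[topic] >= min_count]


docs = [
    "The striker scored a late goal.",
    "The senate passed the bill after the election.",
    "The match went to penalties.",
    "Voters head to the polls for the election.",
    "A recipe for sourdough bread.",
]
topics = propose_topics(docs, keyword_llm)
kept = prune_topics(topics, label_documents(docs, topics, keyword_llm))
final_labels = label_documents(docs, kept, keyword_llm)
print(kept)
print(final_labels)
```

With a real model, `keyword_llm` would be replaced by an API call, and the removal threshold (`min_count`, an assumed parameter here) would be tuned to the corpus.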
Quick Start & Requirements
pip install topicgpt_python
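The input format this section calls for, a .jsonl file whose records carry a "text" field, can be prepared and checked with the standard library alone. The sketch below is a minimal example, assuming one JSON object per line; the helper names `write_jsonl` and `check_jsonl` and the file name `sample.jsonl` are hypothetical and not part of the package.

```python
import json
from pathlib import Path
from typing import Iterable


def write_jsonl(path: str, texts: Iterable[str]) -> None:
    """Write one JSON object per line, each with a "text" field."""
    with Path(path).open("w", encoding="utf-8") as f:
        for text in texts:
            f.write(json.dumps({"text": text}, ensure_ascii=False) + "\n")


def check_jsonl(path: str) -> int:
    """Return the record count; raise ValueError if a record lacks a string "text" field."""
    count = 0
    with Path(path).open(encoding="utf-8") as f:
        for line_no, line in enumerate(f, start=1):
            if not line.strip():
                continue  # tolerate blank lines
            record = json.loads(line)
            if not isinstance(record, dict) or not isinstance(record.get("text"), str):
                raise ValueError(f"line {line_no}: missing string 'text' field")
            count += 1
    return count


if __name__ == "__main__":
    # Hypothetical file name, used only for this illustration.
    write_jsonl("sample.jsonl", ["First document.", "Second document."])
    print(check_jsonl("sample.jsonl"))
```

Running a check like this before any model calls catches malformed records early, when they are cheap to fix.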
Input data: .jsonl format with a "text" field.
Highlighted Details
generate_topic_lvl1, generate_topic_lvl2, refine_topics, assign_topics, and correct_topics.
Maintenance & Community
Licensing & Compatibility
Limitations & Caveats