LLM classifier for instant data classification using Llama 2
This project provides an LLM-based classifier that allows users to categorize data using natural language prompts, eliminating the need for labeled datasets. It's designed for users who want to quickly build custom classifiers without extensive data preparation or hyperparameter tuning, leveraging prompt engineering as the primary method for defining classes.
How It Works
The classifier leverages the Llama 2 LLM to generate synthetic training data from user-provided prompts, effectively creating "piles" of examples for each class. It then fine-tunes specialized LLMs derived from Llama 2 to distinguish between these generated data piles. This approach bypasses manual data labeling and allows for classifier customization through prompt refinement.
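The generate-then-train flow described above can be illustrated with a small offline sketch. It is not the project's implementation and does not call the `lamini` package: `generate_examples` is a hypothetical stand-in for the Llama 2 generation step (it just splits the prompt into sentences), and a bag-of-words centroid with cosine similarity is swapped in for the fine-tuned LLM. All function names and prompts here are assumptions made for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def generate_examples(prompt):
    # HYPOTHETICAL stand-in for the LLM call: the real project asks Llama 2
    # to write synthetic examples from the prompt. Here each sentence of the
    # prompt is treated as one "generated" example so the sketch runs offline.
    return [s.strip() for s in prompt.split(".") if s.strip()]


def build_piles(prompts):
    """Create one pile of examples per class from its natural-language prompt."""
    return {label: generate_examples(p) for label, p in prompts.items()}


def train(piles):
    # Assumption: a word-count centroid per class replaces the fine-tuning step.
    return {
        label: Counter(tok for ex in examples for tok in tokenize(ex))
        for label, examples in piles.items()
    }


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(count * b[tok] for tok, count in a.items() if tok in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def predict(model, text):
    """Return the class whose pile is most similar to the input text."""
    query = Counter(tokenize(text))
    return max(model, key=lambda label: cosine(query, model[label]))


# Illustrative prompts; the classes are defined purely by their descriptions.
PROMPTS = {
    "cat": "Cats meow and purr. A cat grooms itself. Cats nap in the sun.",
    "dog": "Dogs bark and fetch. A dog wags its tail. Dogs play in the park.",
}

piles = build_piles(PROMPTS)
model = train(piles)
print(predict(model, "it will bark then fetch"))
```

Refining a class in this sketch means editing its entry in `PROMPTS` and rebuilding the piles, which mirrors the prompt-refinement loop the project relies on instead of manual labeling.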
Quick Start & Requirements
pip install lamini
or clone the repository and use provided shell scripts.
Highlighted Details
Maintenance & Community
This appears to be a hackathon project with ongoing refinement. Feedback is encouraged for improvements.
Licensing & Compatibility
The repository does not explicitly state a license in the README. Compatibility for commercial or closed-source use is not specified.
Limitations & Caveats
The project is described as a "week night hackathon project" with known limitations, including inefficient batching for training on many classes and ongoing refinement of LLM example generators. The accuracy of generated examples depends on prompt quality.
1 year ago
Inactive