setfit by huggingface

Few-shot learning framework for Sentence Transformers

Created 3 years ago
2,566 stars

Top 18.2% on SourcePulse

Project Summary

SetFit is an efficient, prompt-free framework for few-shot fine-tuning of Sentence Transformers, aimed at developers and researchers who need accurate text classification with minimal labeled data. Compared with prompt-based methods, it trains faster and supports multilingual classification.

How It Works

SetFit uses Sentence Transformers to generate text embeddings directly, bypassing the need for handcrafted prompts or verbalizers. It employs a two-stage training process: first, it fine-tunes the Sentence Transformer with contrastive learning on sentence pairs generated from the few labeled examples; then it trains a classification head on the embeddings produced by the fine-tuned model. This approach yields competitive accuracy with significantly less data and computation than prompt-based few-shot methods.
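
SetFit's fine-tuning of the Sentence Transformer body is driven by pairs of sentences drawn from the labeled examples. The following is a minimal sketch of how such pairs could be built, in plain Python with no dependency on the setfit package. The function name `make_pairs` and the 1.0/0.0 target convention are assumptions made for illustration, not the library's actual implementation.

```python
from itertools import combinations


def make_pairs(texts, labels):
    """Build (text_a, text_b, target) triples from a small labeled set.

    Hypothetical helper: target is 1.0 when both texts share a label
    (a positive pair) and 0.0 otherwise (a negative pair).
    """
    if len(texts) != len(labels):
        raise ValueError("texts and labels must have the same length")
    pairs = []
    # Enumerate every unordered pair of labeled examples exactly once.
    for (text_a, label_a), (text_b, label_b) in combinations(zip(texts, labels), 2):
        target = 1.0 if label_a == label_b else 0.0
        pairs.append((text_a, text_b, target))
    return pairs


# Toy data, invented for this example.
texts = ["great film", "loved it", "terrible plot", "waste of time"]
labels = ["pos", "pos", "neg", "neg"]
pairs = make_pairs(texts, labels)
```

Because every pair of examples is used, even a handful of labeled sentences yields many training pairs, which is one reason pair-based fine-tuning suits the few-shot setting.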

Quick Start & Requirements

  • Install via pip: pip install setfit
  • For bleeding-edge: pip install git+https://github.com/huggingface/setfit.git
  • Requires Python 3.9+ (for developer install).
  • See Documentation and quickstart for detailed examples.

Highlighted Details

  • Achieves high accuracy with as few as 8 labeled examples per class.
  • Significantly faster training and inference than large prompt-based models.
  • Supports multilingual classification by using any Sentence Transformer checkpoint.
  • Integrates seamlessly with the Hugging Face Hub for model sharing and loading.
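
To make the "8 labeled examples per class" setting concrete, here is a minimal sketch of drawing a fixed number of examples per class from a larger labeled pool. It uses only the Python standard library. The helper name `sample_per_class`, its parameters, and the toy data are assumptions for illustration and are not part of the setfit API.

```python
import random
from collections import defaultdict


def sample_per_class(examples, num_per_class=8, seed=0):
    """Return up to num_per_class (text, label) pairs for each label.

    Hypothetical helper: examples is an iterable of (text, label) tuples.
    """
    by_label = defaultdict(list)
    for text, label in examples:
        by_label[label].append(text)

    rng = random.Random(seed)  # local RNG so the sample is reproducible
    subset = []
    for label in sorted(by_label):
        texts = by_label[label]
        k = min(num_per_class, len(texts))  # a class may have fewer examples
        subset.extend((text, label) for text in rng.sample(texts, k))
    return subset


# Toy pool of 100 labeled texts, invented for this example.
pool = [(f"review {i}", "pos" if i % 2 == 0 else "neg") for i in range(100)]
few_shot = sample_per_class(pool, num_per_class=8)
```

With two classes and eight examples each, the resulting training set contains only 16 labeled texts.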

Maintenance & Community

  • Developed by Hugging Face.
  • Code formatting enforced by black and isort.
  • See notebooks and tutorials for more examples.

Licensing & Compatibility

  • Licensed under Apache 2.0.
  • Compatible with commercial use and closed-source linking.

Limitations & Caveats

The project is primarily focused on text classification tasks. While it supports multilingual models, performance may vary across languages depending on the underlying Sentence Transformer checkpoint.

Health Check

  • Last commit: 1 month ago
  • Responsiveness: Inactive
  • Pull requests (30d): 0
  • Issues (30d): 1

Star History

  • 20 stars in the last 30 days
