setfit by huggingface

Few-shot learning framework for Sentence Transformers

Created 3 years ago

2,663 stars

Top 17.6% on SourcePulse

11 Experts Love This Project

Clement Delangue

Cofounder of Hugging Face

Luis Capelo

Cofounder of Lightning AI

Patrick von Platen

Author of Hugging Face Diffusers; Research Engineer at Mistral

Travis Fischer

Founder of Agentic

and 7 more!

Project Summary

SetFit is an efficient, prompt-free framework for few-shot fine-tuning of Sentence Transformers, aimed at developers and researchers who need high-accuracy text classification with minimal labeled data. It trains faster than prompt-based methods and supports multilingual classification.

How It Works

SetFit uses Sentence Transformers to produce text embeddings directly, so it needs no handcrafted prompts or verbalizers. Training has two stages: first, it fine-tunes the Sentence Transformer body on sentence pairs built from the few labeled examples, using a contrastive objective; then it trains a classification head on the embeddings produced by the fine-tuned body. This approach yields competitive accuracy with far less labeled data and computation than prompt-based methods.
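
The Sentence Transformer fine-tuning stage works on pairs of sentences rather than single labeled examples, which is part of why a handful of labels can go a long way: the number of pairs grows quadratically with the number of examples. The sketch below illustrates that idea in plain Python. It is not SetFit's own code; `build_contrastive_pairs` is a hypothetical helper, and the exhaustive pairing and 1.0/0.0 similarity targets are simplifying assumptions.

```python
import itertools
import random


def build_contrastive_pairs(texts, labels, seed=0):
    """Hypothetical helper: pair every two examples and attach a target.

    Pairs from the same class get similarity 1.0, pairs from different
    classes get 0.0. A contrastive objective would pull the first kind
    together in embedding space and push the second kind apart.
    """
    rng = random.Random(seed)
    pairs = []
    for i, j in itertools.combinations(range(len(texts)), 2):
        similarity = 1.0 if labels[i] == labels[j] else 0.0
        pairs.append((texts[i], texts[j], similarity))
    rng.shuffle(pairs)  # mix positive and negative pairs
    return pairs


# Four labeled sentences (toy data) already yield six training pairs.
texts = ["great film", "loved it", "terrible plot", "waste of time"]
labels = [1, 1, 0, 0]
pairs = build_contrastive_pairs(texts, labels)

for text_a, text_b, similarity in pairs:
    print(f"{similarity:.1f}  {text_a!r} <-> {text_b!r}")
```

With n labeled examples this produces n(n-1)/2 pairs, so even a very small labeled set gives the embedding model a useful number of training signals.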

Quick Start & Requirements

Install via pip: pip install setfit
For bleeding-edge: pip install git+https://github.com/huggingface/setfit.git
Requires Python 3.9+ (for developer install).
See Documentation and quickstart for detailed examples.

Highlighted Details

Achieves high accuracy with as few as 8 labeled examples per class.
Significantly faster training and inference than large prompt-based models.
Supports multilingual classification by using any Sentence Transformer checkpoint.
Integrates seamlessly with the Hugging Face Hub for model sharing and loading.
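
To make the "8 labeled examples per class" setting concrete, the following sketch draws a small, class-balanced subset from a larger labeled pool. It is plain Python written for illustration only: `sample_per_class` and the toy review data are hypothetical and are not part of the library's API.

```python
import random
from collections import defaultdict


def sample_per_class(examples, num_samples=8, seed=42):
    """Hypothetical helper: keep at most `num_samples` examples per label.

    `examples` is a list of (text, label) tuples. Classes with fewer than
    `num_samples` examples are kept in full.
    """
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for text, label in examples:
        by_label[label].append((text, label))

    subset = []
    for label in sorted(by_label):  # sorted for a reproducible order
        group = by_label[label]
        k = min(num_samples, len(group))
        subset.extend(rng.sample(group, k))
    return subset


# Toy pool: 20 positive and 5 negative reviews (made-up data).
pool = [(f"positive review {i}", "pos") for i in range(20)]
pool += [(f"negative review {i}", "neg") for i in range(5)]

few_shot = sample_per_class(pool, num_samples=8)
print(len(few_shot), "examples selected")
```

Fixing the seed keeps the few-shot subset reproducible, which matters when results are compared across runs with so few examples.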

Maintenance & Community

Developed by Hugging Face.
Code formatting enforced by black and isort.
See notebooks and tutorials for more examples.

Licensing & Compatibility

Licensed under Apache 2.0.
Compatible with commercial use and closed-source linking.

Limitations & Caveats

The project is primarily focused on text classification tasks. While it supports multilingual models, performance may vary across languages depending on the underlying Sentence Transformer checkpoint.

Health Check

Last Commit

1 month ago

Responsiveness

Inactive

Star History

28 stars in the last 30 days