SGPT: GPT models for semantic search, with code and pretrained models
SGPT provides pre-trained GPT models for semantic search, offering both Bi-Encoder and Cross-Encoder approaches for symmetric and asymmetric search tasks. It's designed for researchers and developers looking to leverage large language models for efficient and accurate information retrieval.
How It Works
SGPT-BE (Bi-Encoder) fine-tunes only the bias tensors of a GPT model (BitFit) with contrastive learning and applies position-weighted mean pooling over token states to produce semantically rich sentence embeddings. SGPT-CE (Cross-Encoder) uses a GPT model's log probabilities without any fine-tuning, directly scoring the likelihood of a query given a document. This dual approach gives flexibility in trading off retrieval quality against computational cost.
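As a rough sketch of the bi-encoder pooling step, the PyTorch function below implements position-weighted mean pooling over GPT token states as described above; the function name, tensor shapes, and comments are illustrative assumptions, not the repository's API.

```python
import torch

def position_weighted_mean_pooling(hidden_states, attention_mask):
    """Pool GPT token states into one sentence embedding per sequence.

    hidden_states: (batch, seq_len, dim) last-layer token representations
    attention_mask: (batch, seq_len), 1 for real tokens, 0 for padding
    """
    # Position k gets a weight proportional to k, so later tokens (which have
    # attended to more context in a causal model) contribute more.
    positions = torch.arange(1, hidden_states.size(1) + 1, device=hidden_states.device)
    weights = positions.unsqueeze(0) * attention_mask          # zero out padding
    weights = weights.to(hidden_states.dtype)
    weights = weights / weights.sum(dim=1, keepdim=True)       # normalize per sequence
    return (hidden_states * weights.unsqueeze(-1)).sum(dim=1)  # (batch, dim)
```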
Quick Start & Requirements
Install sentence-transformers:
pip install --upgrade git+https://github.com/UKPLab/sentence-transformers.git
For SGPT-specific pooling, install the fork with the SGPT poolings instead:
pip install --upgrade git+https://github.com/Muennighoff/sentence-transformers.git@sgpt_poolings_specb
Pretrained models are available on Hugging Face (e.g., Muennighoff/SGPT-5.8B-weightedmean-nli-bitfit).
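The hedged sketch below shows one plausible end-to-end use of a published checkpoint through sentence-transformers, ranking documents against a query by cosine similarity; the query and document strings are made up, and the 5.8B checkpoint named above requires substantial GPU memory.

```python
# Minimal usage sketch (not the repository's official example). Assumes the
# sentence-transformers install above and the Hugging Face checkpoint named
# in this section; smaller SGPT variants exist if GPU memory is limited.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("Muennighoff/SGPT-5.8B-weightedmean-nli-bitfit")

query = "How do GPT models produce sentence embeddings?"
docs = [
    "SGPT pools GPT token states with position-weighted mean pooling.",
    "The weather in Berlin is mild in October.",
]

# Encode the query and documents, then rank documents by cosine similarity.
query_emb = model.encode(query, convert_to_tensor=True)
doc_embs = model.encode(docs, convert_to_tensor=True)
scores = util.cos_sim(query_emb, doc_embs)[0]

for doc, score in zip(docs, scores):
    print(f"{score.item():.3f}  {doc}")
```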
Highlighted Details
Integrates with the sentence-transformers library.
Maintenance & Community
The project is actively updated, with recent releases including GRIT and GritLM models that unify previous SGPT architectures. The author, Niklas Muennighoff, is a notable contributor in the NLP space. Further updates and model requests can be made via GitHub issues.
Licensing & Compatibility
The project's models are generally available under permissive licenses compatible with commercial use, but specific model licenses on Hugging Face should be verified. The code itself appears to be MIT licensed.
Limitations & Caveats
Larger SGPT models require substantial GPU resources (e.g., >24GB VRAM for 5.8B models). While the paper claims state-of-the-art performance on benchmarks like BEIR and USEB, users should verify performance on their specific use cases. The project recommends newer GRIT/GritLM models, suggesting potential future deprecation of older SGPT models.