AnglE by SeanLee97

Sentence embedding framework for training/inference using BERT/LLM backbones

Created 1 year ago
555 stars

Top 57.7% on SourcePulse

Project Summary

AnglE is a Python library for training sentence embedding models and running inference with them. It targets researchers and developers building NLP applications that depend on semantic similarity. Models trained with it report state-of-the-art results on benchmarks such as STS and MTEB.
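To make "semantic similarity" concrete, here is a minimal sketch of how sentence embeddings are typically compared once a model has produced them. It does not call AnglE itself: the three-dimensional vectors are made-up stand-ins for real embeddings, and `cosine_similarity` and `rank_by_similarity` are hypothetical helper names used only for illustration.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def rank_by_similarity(query, candidates):
    """Return (name, score) pairs sorted from most to least similar."""
    scored = [(name, cosine_similarity(query, vec))
              for name, vec in candidates.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)

# Toy vectors standing in for real sentence embeddings (assumed values).
query = [1.0, 0.0, 1.0]
candidates = {
    "paraphrase": [0.9, 0.1, 1.1],
    "unrelated": [-1.0, 1.0, 0.0],
}
ranking = rank_by_similarity(query, candidates)
```

In a real application the vectors would come from an embedding model, and the same ranking step would drive search, deduplication, or clustering.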

How It Works

AnglE uses an angle-optimized loss function (published at ACL 2024) alongside contrastive, CoSENT, and Espresso losses. It supports a wide range of backbones: BERT-based models (BERT, RoBERTa) and LLMs (LLaMA, Mistral, Qwen), as well as bidirectional LLM variants. Users can therefore pick the architecture that suits their task, while the angle-based objective is intended to capture semantic relationships that cosine-only objectives can miss.
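As a rough illustration of the ranking idea behind the CoSENT loss mentioned above, the sketch below penalizes every pair of examples whose predicted similarity order contradicts the gold labels. It is a simplified pure-Python version, not the library's implementation: the function name `ranking_loss` and the temperature default `tau=0.05` are assumptions made for this example, and the angle-optimized objective itself is not reproduced here.

```python
import math

def ranking_loss(scores, labels, tau=0.05):
    """CoSENT-style loss: log(1 + sum(exp((s_low - s_high) / tau))).

    The sum runs over every ordered pair (high, low) in which the gold
    label of `high` is strictly greater than the gold label of `low`.
    """
    terms = []
    for i, label_i in enumerate(labels):
        for j, label_j in enumerate(labels):
            if label_i > label_j:
                # Positive when the lower-labelled pair scores higher.
                terms.append((scores[j] - scores[i]) / tau)
    if not terms:
        return 0.0  # no pair with a strict label ordering
    # Numerically stable log-sum-exp; the constant 1 is exp(0).
    m = max(0.0, max(terms))
    return m + math.log(math.exp(-m) + sum(math.exp(t - m) for t in terms))

# Predicted similarities that agree with the labels give a near-zero loss.
good = ranking_loss(scores=[0.9, 0.1], labels=[1, 0])
# Swapping the predictions violates the ordering and is penalized heavily.
bad = ranking_loss(scores=[0.1, 0.9], labels=[1, 0])
```

Because the loss only depends on the relative order of scores, it works with graded similarity labels as well as binary ones.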

Quick Start & Requirements

Highlighted Details

  • Models trained with AnglE have achieved state-of-the-art (SOTA) results on the MTEB leaderboard.
  • Supports training with single or multiple GPUs.
  • Offers a variety of loss functions and backbone model integrations.
  • Includes tools for custom training and fine-tuning with different dataset formats.

Maintenance & Community

The project is actively developed, with recent updates including support for Espresso Sentence Embeddings and training with positive pairs only. The primary contact is xmlee97@gmail.com.

Licensing & Compatibility

The project is licensed under the MIT License. Pretrained models may carry their own licenses, so suitability for commercial use depends on the license of the specific model.

Limitations & Caveats

The library is versatile, but its fine-tuning tips recommend adjusting loss weights according to the dataset format, which suggests that results are sensitive to configuration. The README also notes that the Sentence-Transformers implementation of the AnglE loss is partial and may not perform as well as the official code.

Health Check

  • Last commit: 6 months ago
  • Responsiveness: Inactive
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 4 stars in the last 30 days
