AnglE by SeanLee97

Sentence embedding framework for training/inference using BERT/LLM backbones

Created 2 years ago

568 stars

Top 56.6% on SourcePulse


3 Experts Love This Project

Omar Sanseviero

DevRel at Google DeepMind

Project Summary

AnglE is a Python library for training sentence embedding models and running inference with them. It targets researchers and developers building NLP applications that depend on semantic similarity. The project reports state-of-the-art results on benchmarks such as STS and MTEB.

How It Works

AnglE uses an angle-optimized loss function (published at ACL 2024) alongside contrastive, CoSENT, and Espresso losses. It supports a wide range of backbones: BERT-based models (BERT, RoBERTa) and LLMs (LLaMA, Mistral, Qwen), as well as bidirectional LLM variants. Users can therefore pick the architecture that suits their task, while the angle-based objective is intended to capture semantic differences that a purely cosine-based objective may miss.
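To make the ranking idea behind one of the losses named above concrete, here is a minimal plain-Python sketch of the CoSENT objective as it is commonly formulated: for every two sentence pairs where the first has a higher similarity label than the second, the loss penalizes the second pair for having a higher cosine similarity. This is an illustration only, not the library's implementation; the function names and the scale value of 20 are assumptions made for this example.

```python
import math
from itertools import permutations


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def cosent_loss(pairs, labels, scale=20.0):
    """CoSENT-style ranking loss (illustrative sketch).

    pairs  : list of (embedding_a, embedding_b) tuples
    labels : one similarity label per pair (higher = more similar)
    scale  : assumed temperature-like factor, not taken from the source
    """
    sims = [cosine(u, v) for u, v in pairs]
    # The leading 0.0 stands for the "1 +" inside log(1 + sum(exp(...))).
    terms = [0.0]
    for i, j in permutations(range(len(sims)), 2):
        if labels[i] > labels[j]:
            # Pair i should score higher than pair j; penalize the reverse.
            terms.append(scale * (sims[j] - sims[i]))
    # Numerically stable log-sum-exp.
    m = max(terms)
    return m + math.log(sum(math.exp(t - m) for t in terms))
```

When the cosine similarities are ordered the same way as the labels, the loss is close to zero; when the order is reversed, it grows roughly linearly with the scaled similarity gap.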

Quick Start & Requirements

Install via pip: python -m pip install -U angle-emb
Requires Python and PyTorch. GPU acceleration is recommended for training and inference.
Official documentation: https://angle.readthedocs.io/en/latest/index.html
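After installation, inference might look like the following sketch. It assumes the `AnglE.from_pretrained` and `encode` calls described in the project's README; the model name, the `pooling_strategy` value, and the `to_numpy` argument are assumptions that should be checked against the official documentation. The `embed` and `cosine_similarity` helpers are hypothetical names introduced for this example.

```python
import math


def embed(texts, model_name="WhereIsAI/UAE-Large-V1"):
    """Encode texts with a pretrained AnglE model.

    The model name and keyword arguments are assumptions; verify them
    against the official documentation before relying on them.
    """
    # Imported lazily so the helper below works without the package.
    from angle_emb import AnglE

    angle = AnglE.from_pretrained(model_name, pooling_strategy="cls")
    return angle.encode(texts, to_numpy=True)


def cosine_similarity(u, v):
    """Cosine similarity between two embedding vectors."""
    u = [float(a) for a in u]
    v = [float(b) for b in v]
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)
```

A typical call would be `vecs = embed(["hello world", "hi there"])` followed by `cosine_similarity(vecs[0], vecs[1])`. The first call downloads the model weights, so it needs network access and benefits from a GPU.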

Highlighted Details

Models trained with AnglE have reached state-of-the-art (SOTA) results on the MTEB leaderboard.
Supports training with single or multiple GPUs.
Offers a variety of loss functions and backbone model integrations.
Includes tools for custom training and fine-tuning with different dataset formats.

Maintenance & Community

The project is actively developed; recent updates include support for Espresso Sentence Embeddings and for training with positive pairs only. The primary contact is xmlee97@gmail.com.

Licensing & Compatibility

The project is licensed under the MIT License. Pretrained models may carry their own licenses, so suitability for commercial use depends on the license of the specific pretrained model.

Limitations & Caveats

The fine-tuning tips recommend adjusting loss weights according to the dataset format, which suggests that results are sensitive to configuration. The README also notes that the Sentence-Transformers implementation of AnglE loss is partial and may not perform as well as the official code.

Health Check

Last Commit

2 months ago

Responsiveness

Inactive

Star History

2 stars in the last 30 days