AnglE by SeanLee97

Sentence embedding framework for training/inference using BERT/LLM backbones

created 1 year ago
549 stars

Top 59.0% on sourcepulse

Project Summary

AnglE is a Python library for training sentence embedding models and running inference with them. It targets researchers and developers building NLP applications that depend on semantic similarity. Models trained with it report state-of-the-art results on benchmarks such as STS and MTEB.

How It Works

AnglE uses an angle-optimized loss function (ACL 2024) alongside contrastive, CoSENT, and Espresso losses. It supports a wide range of backbones: BERT-based models (BERT, RoBERTa) and LLMs (LLaMA, Mistral, Qwen), as well as bidirectional variants. Users can therefore pick the architecture that suits their task, while the angle-based objective is intended to capture semantic relationships that a cosine-only objective can miss.
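To make the loss discussion concrete, the following is a minimal pure-Python sketch of a CoSENT-style ranking loss over cosine similarities, one of the losses named above. It is an illustration under stated assumptions, not AnglE's own code: the function names `cosine` and `cosent_loss` and the default `scale=20.0` are chosen here for the example, and the angle-optimized loss itself (which works in complex space) is not reproduced.

```python
import math
from itertools import product


def cosine(u, v):
    """Cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def cosent_loss(pairs, labels, scale=20.0):
    """CoSENT-style loss: log(1 + sum exp(scale * (sim_low - sim_high))).

    pairs  : list of (u, v) sentence-embedding pairs
    labels : similarity label per pair (higher means more similar)
    scale  : temperature-like multiplier (assumed value for this sketch)

    The sum runs over every ordered pair (i, j) where pair i is labeled
    more similar than pair j; the loss is small when cosine similarity
    ranks the pairs in the same order as the labels.
    """
    sims = [cosine(u, v) for u, v in pairs]
    terms = [0.0]  # exp(0) = 1 supplies the "1 +" inside the log
    for i, j in product(range(len(sims)), repeat=2):
        if labels[i] > labels[j]:
            terms.append(scale * (sims[j] - sims[i]))
    # Numerically stable log-sum-exp
    m = max(terms)
    return m + math.log(sum(math.exp(t - m) for t in terms))


# Two toy pairs: identical vectors and orthogonal vectors.
toy_pairs = [([1.0, 0.0], [1.0, 0.0]), ([1.0, 0.0], [0.0, 1.0])]
loss_consistent = cosent_loss(toy_pairs, labels=[1, 0])  # labels agree with cosine
loss_reversed = cosent_loss(toy_pairs, labels=[0, 1])    # labels contradict cosine
```

With labels that agree with the cosine ranking, the loss is close to zero; with reversed labels, it grows roughly linearly with `scale`. According to the summary above, the angle-optimized objective is used alongside losses of this kind rather than replacing them.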

Quick Start & Requirements

Highlighted Details

  • Models trained with AnglE have achieved state-of-the-art (SOTA) results on the MTEB leaderboard.
  • Supports training with single or multiple GPUs.
  • Offers a variety of loss functions and backbone model integrations.
  • Includes tools for custom training and fine-tuning with different dataset formats.

Maintenance & Community

The project is actively developed, with recent updates including support for Espresso Sentence Embeddings and training with positive pairs only. The primary contact is xmlee97@gmail.com.

Licensing & Compatibility

The project is licensed under the MIT License. Pretrained models may carry their own licenses, so suitability for commercial use depends on the license of the specific pretrained model.

Limitations & Caveats

The library is versatile, but its fine-tuning tips recommend adjusting loss weights according to dataset format, which suggests that results are sensitive to configuration. The README also notes that the Sentence-Transformers implementation of AnglE loss is partial and may not perform as well as the official code.

Health Check

Last commit: 4 months ago
Responsiveness: 1 day
Pull Requests (30d): 0
Issues (30d): 0

Star History

16 stars in the last 90 days
