bleurt by google-research

NLG metric based on transfer learning

created 5 years ago
746 stars

Project Summary

BLEURT is a Python library and command-line tool for evaluating Natural Language Generation (NLG) output. It provides a learned metric, built on BERT and RemBERT, that takes a reference sentence and a candidate sentence and returns a score reflecting how fluent the candidate is and how well it preserves the meaning of the reference. It is aimed at researchers and developers who need NLG evaluation that goes beyond n-gram overlap metrics such as BLEU.

How It Works

BLEURT is a regression model trained on human ratings of sentence pairs. It uses transfer learning: it starts from a pre-trained language model (BERT or RemBERT) and is then fine-tuned to predict the human rating for a given reference and candidate. The aim is to learn quality judgments that surface-overlap metrics tend to miss, such as a candidate that paraphrases the reference with few shared words.
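As a rough illustration of "a regression model trained on human ratings", the sketch below fits a linear head to toy ratings by minimizing mean squared error with gradient descent. It is not BLEURT's code. The two-number feature vectors stand in for the sentence-pair representation a pre-trained encoder would produce, and the names `predict`, `fit_linear_head`, `mean_squared_error`, and `TOY_RATINGS` are hypothetical.

```python
"""Toy sketch: a linear regression head trained on human ratings (not BLEURT itself)."""


def predict(weights, bias, features):
    """Return the predicted rating: dot(weights, features) + bias."""
    return sum(w * x for w, x in zip(weights, features)) + bias


def mean_squared_error(weights, bias, examples):
    """Average squared gap between predicted and human ratings."""
    total = 0.0
    for features, rating in examples:
        total += (predict(weights, bias, features) - rating) ** 2
    return total / len(examples)


def fit_linear_head(examples, learning_rate=0.1, epochs=2000):
    """Fit weights and bias by full-batch gradient descent on the MSE loss.

    `examples` is a list of (features, rating) pairs. In a real learned
    metric the features would come from a pre-trained encoder; here they
    are made-up numbers (assumption for illustration only).
    """
    dim = len(examples[0][0])
    weights = [0.0] * dim
    bias = 0.0
    n = len(examples)
    for _ in range(epochs):
        grad_w = [0.0] * dim
        grad_b = 0.0
        for features, rating in examples:
            error = predict(weights, bias, features) - rating
            for i, x in enumerate(features):
                grad_w[i] += 2.0 * error * x / n
            grad_b += 2.0 * error / n
        weights = [w - learning_rate * g for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b
    return weights, bias


# Hypothetical data: (features of a reference/candidate pair, human rating).
TOY_RATINGS = [
    ([1.0, 1.0], 1.00),
    ([0.0, 1.0], 0.40),
    ([1.0, 0.0], 0.70),
    ([0.0, 0.0], 0.10),
    ([0.5, 0.5], 0.55),
]

if __name__ == "__main__":
    w, b = fit_linear_head(TOY_RATINGS)
    print("weights:", w, "bias:", b)
    print("training MSE:", mean_squared_error(w, b, TOY_RATINGS))
```

The real model replaces the hand-made features with the output of a large Transformer and updates the encoder during training as well, so this only conveys the shape of the objective, not the source of BLEURT's accuracy.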

Quick Start & Requirements

Highlighted Details

  • Offers a command-line tool, a Python API, and a TensorFlow API.
  • Supports fine-tuning on custom rating data for domain-specific evaluation.
  • The BLEURT-20 checkpoint is multilingual, supporting 13 languages.
  • Includes methods for speeding up inference, such as batching and distilled models.
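To make the Python API and batching points above concrete, here is a minimal sketch of scoring sentence pairs in fixed-size batches. It assumes the `bleurt.score.BleurtScorer` class and its keyword-only `score(references=..., candidates=...)` method as described in the project's README; the helper names `load_bleurt_scorer`, `score_in_batches`, and `ExactMatchScorer` are hypothetical. The stand-in scorer exists only so the batching logic can run without BLEURT or a checkpoint installed, and it is not a meaningful metric.

```python
"""Sketch: scoring sentence pairs in batches with a BLEURT-style scorer."""
from typing import List, Sequence


def load_bleurt_scorer(checkpoint_path: str):
    """Build a real scorer (assumes the `bleurt` package is installed).

    The import is done lazily so the rest of this sketch runs without it.
    `checkpoint_path` should point at a downloaded checkpoint directory.
    """
    from bleurt import score as bleurt_score

    return bleurt_score.BleurtScorer(checkpoint_path)


def score_in_batches(
    scorer,
    references: Sequence[str],
    candidates: Sequence[str],
    batch_size: int = 16,
) -> List[float]:
    """Score aligned reference/candidate pairs, `batch_size` pairs at a time."""
    if len(references) != len(candidates):
        raise ValueError("references and candidates must have the same length")
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    scores: List[float] = []
    for start in range(0, len(references), batch_size):
        end = start + batch_size
        batch_scores = scorer.score(
            references=list(references[start:end]),
            candidates=list(candidates[start:end]),
        )
        scores.extend(float(s) for s in batch_scores)
    return scores


class ExactMatchScorer:
    """Hypothetical stand-in with the same call shape; NOT a real metric."""

    def score(self, *, references, candidates):
        return [1.0 if r == c else 0.0 for r, c in zip(references, candidates)]


if __name__ == "__main__":
    refs = ["This is a test.", "The cat sat on the mat."]
    cands = ["This is the test.", "The cat sat on the mat."]
    # Swap in load_bleurt_scorer("path/to/checkpoint") for real scores.
    print(score_in_batches(ExactMatchScorer(), refs, cands, batch_size=1))
```

The library's own scorer may already batch internally, so the explicit loop is mainly useful for streaming large files or reporting progress. Passing the scorer in as an argument also keeps the loop easy to test with a stub.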

Maintenance & Community

  • Developed by Google Research.
  • Latest recommended checkpoint (BLEURT-20) released Oct 2021.
  • Reproducibility details for papers are available.

Licensing & Compatibility

  • Apache 2.0 License.
  • Compatible with commercial use and closed-source linking.

Limitations & Caveats

The default "test" checkpoint is inaccurate and should not be used for real evaluation; download one of the recommended checkpoints, such as BLEURT-20, instead. While BLEURT-20 supports multiple languages, its performance on languages that were not explicitly tested may vary. Because of the characteristics of the training data, the scores do not cleanly separate adequacy from fluency.

Health Check

  • Last commit: 2 years ago
  • Responsiveness: Inactive
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 19 stars in the last 90 days
