BERT-flow by bohanli

TensorFlow code for sentence embeddings research paper

Created 5 years ago

535 stars

Top 59.2% on SourcePulse

1 Expert Loves This Project

Shizhe Diao

Author of LMFlow; Research Scientist at NVIDIA

Project Summary

This repository provides a TensorFlow implementation of the EMNLP 2020 paper "On the Sentence Embeddings from Pre-trained Language Models." It offers a method for improving sentence embeddings derived from pre-trained language models such as BERT, and is aimed at Natural Language Processing researchers and practitioners who need better semantic representations of sentences. The paper reports state-of-the-art results on sentence similarity tasks at the time of publication.

How It Works

The project implements a "flow": an invertible, flow-based generative model that refines sentence embeddings. The flow is learned without labels and maps BERT's raw sentence representations to a more semantically meaningful space; as described in the paper, only the flow's parameters are trained in this step while BERT's stay fixed. Separately, and optionally, BERT itself can first be fine-tuned with Natural Language Inference (NLI) supervision. Both steps aim to improve performance on tasks such as semantic textual similarity (STS).
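
To make the idea of a learned invertible mapping concrete, here is a minimal sketch of the change-of-variables likelihood that flow-based models maximize. It is not the repository's code: a single elementwise affine layer stands in for the much deeper invertible network used in the paper, and the names flow_forward, flow_inverse, log_likelihood, shift, and log_scale are hypothetical.

```python
import math

def flow_forward(u, shift, log_scale):
    """Map an embedding u to a latent z with an invertible elementwise affine transform."""
    z = [(ui - si) * math.exp(ls) for ui, si, ls in zip(u, shift, log_scale)]
    # The Jacobian is diagonal, so log|det J| is the sum of the log scales.
    log_det = sum(log_scale)
    return z, log_det

def flow_inverse(z, shift, log_scale):
    """Undo flow_forward exactly, recovering the original embedding."""
    return [zi * math.exp(-ls) + si for zi, si, ls in zip(z, shift, log_scale)]

def log_likelihood(u, shift, log_scale):
    """log p(u) = log N(z; 0, I) + log|det dz/du| (change of variables)."""
    z, log_det = flow_forward(u, shift, log_scale)
    d = len(z)
    log_prior = -0.5 * sum(zi * zi for zi in z) - 0.5 * d * math.log(2 * math.pi)
    return log_prior + log_det

# Toy 3-dimensional "embedding"; real BERT embeddings have hundreds of dimensions.
u = [0.5, -1.0, 2.0]
print(log_likelihood(u, shift=[0.0, 0.0, 0.0], log_scale=[0.0, 0.0, 0.0]))
```

Training would adjust shift and log_scale (or, in a real flow, the network weights) to raise this log-likelihood over unlabeled sentences; the resulting z is then used as the sentence embedding.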

Quick Start & Requirements

Install: Clone the repository and set up environment variables for model and data directories.
Prerequisites: Python >= 3.6, TensorFlow >= 1.14. Requires downloading pre-trained BERT models (base and large) and GLUE benchmark datasets (specifically STS-B).
Setup: Model and dataset downloads are manual and can take a while, depending on network speed and file sizes.
Links: BERT Models, GLUE Benchmark, SentEval.
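
Before launching a long run, it can help to confirm that the directory variables are set and point at real directories. The sketch below shows one way to do that; the variable names BERT_MODEL_DIR and GLUE_DATA_DIR are placeholders chosen for illustration, not the names the repository uses, so substitute whatever its README specifies.

```python
import os

# Hypothetical variable names; replace them with the ones from the repository's README.
REQUIRED_DIRS = ("BERT_MODEL_DIR", "GLUE_DATA_DIR")

def missing_dirs(env, required=REQUIRED_DIRS):
    """Return the variables that are unset or do not point at an existing directory."""
    return [name for name in required if not os.path.isdir(env.get(name, ""))]

problems = missing_dirs(os.environ)
if problems:
    print("Set these to existing directories first:", ", ".join(problems))
else:
    print("Model and data directories look ready.")
```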

Highlighted Details

Achieves Spearman's rho of 81.18 on STS-B using BERT-large-NLI-flow (trained on target data).
Supports fine-tuning BERT with NLI supervision for improved embeddings.
Enables unsupervised learning of flow-based generative models for sentence embeddings.
Provides scripts for both training and prediction/evaluation.
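
For readers unfamiliar with how an STS score such as the one above is typically computed, the following sketch scores each sentence pair by the cosine similarity of its two embeddings and then takes Spearman's rank correlation against the human ratings. It is a generic illustration in pure Python, not the repository's evaluation script; the function names and the toy data are made up, and published figures are usually reported as rho multiplied by 100.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def average_ranks(values):
    """1-based ranks, with tied values sharing the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def spearman(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Toy data: three sentence pairs as (embedding_a, embedding_b), plus human scores.
pairs = [([1.0, 0.0], [1.0, 0.1]), ([1.0, 0.0], [0.5, 0.5]), ([1.0, 0.0], [0.0, 1.0])]
gold = [4.8, 3.0, 0.5]
predicted = [cosine(a, b) for a, b in pairs]
print(spearman(predicted, gold))
```

Because Spearman's rho depends only on rank order, any monotonic rescaling of the similarity scores leaves the result unchanged.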

Maintenance & Community

The project is associated with authors from CMU. Contact information is provided for questions. No explicit community channels (like Discord/Slack) or roadmap are mentioned.

Licensing & Compatibility

The repository's license is not explicitly stated in the README. However, it acknowledges borrowing heavily from google-research/bert, zihangdai/xlnet, and tensorflow/tensor2tensor, each of which carries its own license terms. Commercial use or closed-source linking would require clarification of the license that applies to this codebase.

Limitations & Caveats

The implementation is specific to TensorFlow 1.x. The README does not detail support for newer TensorFlow versions or other frameworks like PyTorch. The setup involves manual downloading of large pre-trained models and datasets.

Health Check

Last Commit

4 years ago

Responsiveness

Inactive

Star History

1 star in the last 30 days