bert-embedding  by imgarylai

Deprecated tool for generating token-level embeddings from BERT models

created 6 years ago
451 stars

Top 67.8% on sourcepulse

Project Summary

This project provides token-level embeddings from pre-trained BERT models using MXNet and GluonNLP. It targets NLP researchers and developers who want to use pre-trained language representations without fine-tuning a model end to end, and it offers a simple way to plug BERT's contextual embeddings into existing NLP pipelines.

How It Works

The library extracts token embeddings from pre-trained BERT models, using the MXNet deep learning framework and the GluonNLP toolkit for model loading and inference. Users can select among pre-trained BERT models (e.g., bert_12_768_12, bert_24_1024_16). An Out-Of-Vocabulary (OOV) word, which BERT's tokenizer breaks into several sub-tokens, gets a single embedding by averaging the sub-token embeddings, summing them, or taking the last one.
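The three OOV strategies can be illustrated with a minimal NumPy sketch. This is not the library's own code: the function name pool_subword_vectors and its oov_way argument are hypothetical, chosen to mirror the 'avg', 'sum', and 'last' options, and the sketch assumes an OOV word has already been split into sub-tokens with one vector each.

```python
import numpy as np


def pool_subword_vectors(vectors, oov_way="avg"):
    """Combine the vectors of one word's sub-tokens into a single vector.

    Hypothetical helper for illustration only; it is not part of bert-embedding.
    """
    stacked = np.stack(vectors)  # shape: (num_sub_tokens, embedding_dim)
    if oov_way == "avg":
        return stacked.mean(axis=0)  # element-wise mean over sub-tokens
    if oov_way == "sum":
        return stacked.sum(axis=0)   # element-wise sum over sub-tokens
    if oov_way == "last":
        return stacked[-1]           # keep only the final sub-token's vector
    raise ValueError(f"unknown oov_way: {oov_way!r}")


# Toy 2-dimensional vectors standing in for two sub-tokens of one word.
pieces = [np.array([1.0, 2.0]), np.array([3.0, 6.0])]
avg_vec = pool_subword_vectors(pieces, "avg")
sum_vec = pool_subword_vectors(pieces, "sum")
last_vec = pool_subword_vectors(pieces, "last")
```

Averaging keeps the result on the same scale as an ordinary token vector, whereas summing grows with the number of sub-tokens, which may matter if downstream code compares vector norms.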

Quick Start & Requirements

  • Install: pip install bert-embedding
  • GPU Support: Requires mxnet-cu92 (or compatible MXNet GPU version).
  • Usage: Instantiate BertEmbedding and call it with a list of sentences. GPU usage requires setting the MXNet context.
  • Documentation: README
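
The usage steps above might look like the following sketch. It assumes the interface described in the project README: a BertEmbedding class that accepts an MXNet context through a ctx argument and, when called with a list of sentences, returns one (tokens, vectors) pair per sentence. The helpers split_sentences, embed_sentences, and main are hypothetical names introduced here, and the exact constructor arguments may differ between releases.

```python
def split_sentences(text):
    """Return the non-empty lines of `text`, treating each line as one sentence."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def embed_sentences(sentences, use_gpu=False):
    """Embed sentences with bert-embedding (interface assumed from the README)."""
    # Deferred imports: MXNet is a heavy dependency that is only needed here.
    import mxnet as mx
    from bert_embedding import BertEmbedding

    # GPU use needs a GPU build of MXNet (e.g. mxnet-cu92) and a GPU context.
    ctx = mx.gpu(0) if use_gpu else mx.cpu()
    embedder = BertEmbedding(ctx=ctx)
    return embedder(sentences)


def main():
    text = """
    We introduce a new language representation model called BERT.
    BERT is designed to pre-train deep bidirectional representations.
    """
    for tokens, vectors in embed_sentences(split_sentences(text)):
        # One vector per token; print the token count and the vector shape.
        print(len(tokens), vectors[0].shape)
```

Calling main() downloads pre-trained weights on first use, so it needs network access as well as a working MXNet installation.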

Highlighted Details

  • Supports multiple pre-trained BERT models, including uncased and cased versions trained on BookCorpus and Wikipedia.
  • Offers flexibility in handling OOV tokens ('avg', 'sum', 'last').
  • Embeddings are 768-dimensional for the base model (bert_12_768_12) and 1024-dimensional for the large model (bert_24_1024_16).
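
The embedding size can be read off the model name. The sketch below assumes the names follow the GluonNLP pattern bert_<layers>_<hidden units>_<attention heads> and that each token vector has as many dimensions as the model has hidden units; parse_model_name is a hypothetical helper, not part of the library.

```python
def parse_model_name(name):
    """Split a name such as 'bert_12_768_12' into its architecture numbers.

    Assumes the pattern bert_<layers>_<hidden units>_<attention heads>.
    """
    parts = name.split("_")
    if len(parts) != 4 or parts[0] != "bert":
        raise ValueError(f"unexpected model name: {name!r}")
    layers, hidden_units, heads = (int(p) for p in parts[1:])
    return {"layers": layers, "hidden_units": hidden_units, "attention_heads": heads}


base = parse_model_name("bert_12_768_12")
large = parse_model_name("bert_24_1024_16")
```

Under that assumption, downstream code sized for 768-dimensional vectors would need adjusting before switching to the larger model.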

Maintenance & Community

The project is marked as deprecated by the author due to lack of maintenance time. The author is open to contributions from interested maintainers.

Licensing & Compatibility

The README does not explicitly state a license. Compatibility with commercial or closed-source projects is not specified.

Limitations & Caveats

The project is deprecated and no longer actively maintained. The MXNet GPU build named in the instructions (mxnet-cu92, built for CUDA 9.2) is likely outdated, so running on newer CUDA versions may require manual dependency management.

Health Check

  • Last commit: 5 years ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 0 stars in the last 90 days
