Deprecated tool for generating token-level embeddings from BERT models
This project provides token-level embeddings from BERT models using MXNet and GluonNLP, targeting NLP researchers and developers who want to leverage pre-trained language representations without full end-to-end model fine-tuning. It offers a simpler way to integrate BERT's powerful contextual embeddings into existing NLP pipelines.
How It Works
The library extracts token embeddings from pre-trained BERT models. It leverages the MXNet deep learning framework and the GluonNLP toolkit for model loading and inference. Users can specify different pre-trained BERT models (e.g., bert_12_768_12, bert_24_1024_16) and handle out-of-vocabulary (OOV) tokens by averaging, summing, or taking the last subword embedding.
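To illustrate the model selection and OOV handling described above, here is a minimal sketch. The constructor arguments (model, dataset_name) and the oov_way parameter follow the project's published examples, but exact names may differ across versions and should be treated as assumptions.

```python
# Minimal sketch (assumed API): choose a pre-trained checkpoint and an OOV
# strategy for recombining subword vectors into per-token embeddings.
from bert_embedding import BertEmbedding

# `model` and `dataset_name` select the pre-trained checkpoint (assumed arguments).
bert = BertEmbedding(model='bert_24_1024_16',
                     dataset_name='book_corpus_wiki_en_cased')

sentences = ['BERT produces contextual token embeddings.',
             'Out-of-vocabulary words are split into subword pieces.']

# `oov_way` chooses how subword vectors are merged back into one token vector:
# 'avg', 'sum', or 'last' (assumed parameter name).
results = bert(sentences, oov_way='avg')

for tokens, vectors in results:
    print(tokens[:3], vectors[0].shape)  # first tokens and embedding dimension
```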
Quick Start & Requirements
Install with pip install bert-embedding. GPU inference additionally requires a GPU build of MXNet such as mxnet-cu92 (or a compatible MXNet GPU version). To use the library, instantiate BertEmbedding and call it with a list of sentences; GPU usage requires setting the MXNet context, as sketched below.
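A minimal quick-start sketch, assuming a GPU build of MXNet (e.g., mxnet-cu92) is installed and that BertEmbedding accepts an MXNet context through a ctx argument:

```python
# Quick-start sketch: CPU by default, GPU via an explicit MXNet context.
import mxnet as mx
from bert_embedding import BertEmbedding

sentences = ['A quick start example sentence.']

# Default: runs on CPU.
bert_cpu = BertEmbedding()
cpu_result = bert_cpu(sentences)

# GPU: pass an MXNet GPU context (requires a GPU MXNet build, e.g. mxnet-cu92).
bert_gpu = BertEmbedding(ctx=mx.gpu(0))
gpu_result = bert_gpu(sentences)

tokens, vectors = gpu_result[0]
print(len(tokens), vectors[0].shape)  # token count and embedding dimension
```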
Maintenance & Community
The project is marked as deprecated by the author due to lack of maintenance time. The author is open to contributions from interested maintainers.
Licensing & Compatibility
The README does not explicitly state a license. Compatibility with commercial or closed-source projects is not specified.
Limitations & Caveats
The project is deprecated and no longer actively maintained. The specific MXNet GPU version (mxnet-cu92) might be outdated, potentially requiring manual dependency management for newer CUDA versions.
Last updated 5 years ago; the repository is inactive.