Transformer model for language understanding with knowledge-based embeddings
LUKE (Language Understanding with Knowledge-based Embeddings) is a transformer-based language model that incorporates knowledge-based entity representations. It targets NLP researchers and practitioners seeking state-of-the-art performance on tasks like named entity recognition, relation classification, and question answering, offering improved contextual understanding through entity-aware self-attention.
How It Works
LUKE extends the transformer by treating words and entities in the input text as separate tokens and computing contextualized representations for both. Its entity-aware self-attention mechanism considers the type of each token (word or entity) when computing attention scores. This approach, detailed in the EMNLP 2020 paper, yields richer contextual representations that benefit downstream NLP tasks.
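A minimal sketch of what entity-aware inputs look like in practice, using the Hugging Face Transformers integration (the studio-ousia/luke-base checkpoint and the entity_spans argument follow that library's documented LUKE API; the example text is illustrative):

```python
# Sketch: encode text plus explicit entity spans with LUKE via
# Hugging Face Transformers (assumes `pip install transformers torch`).
from transformers import LukeModel, LukeTokenizer

tokenizer = LukeTokenizer.from_pretrained("studio-ousia/luke-base")
model = LukeModel.from_pretrained("studio-ousia/luke-base")

text = "Beyoncé lives in Los Angeles."
# Character-level spans of the entity mentions in `text`.
entity_spans = [(0, 7), (17, 28)]

inputs = tokenizer(text, entity_spans=entity_spans, return_tensors="pt")
outputs = model(**inputs)

# Words and entities receive separate contextualized representations.
word_states = outputs.last_hidden_state            # (1, num_word_tokens, hidden)
entity_states = outputs.entity_last_hidden_state   # (1, num_entities, hidden)
```

The entity representations, not just the word-level states, are what downstream heads for NER, relation classification, and question answering build on.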
Quick Start & Requirements
Install dependencies with:

```
poetry install
```

Optional extras are available for pretraining (pretraining, pretraining opennlp, pretraining icu). PyTorch is typically installed separately to match your hardware, e.g. for CUDA 11.3:

```
poetry run pip3 install torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cu113
```
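A quick, generic sanity check (not part of the LUKE README) to confirm the manually installed PyTorch wheel matches your hardware:

```python
# Verify the PyTorch build before running fine-tuning or pretraining.
import torch

print("torch version:", torch.__version__)      # CUDA wheels report e.g. "+cu113"
print("CUDA available:", torch.cuda.is_available())
```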
Highlighted Details

Maintenance & Community
The repository received updates as recently as late 2022, adding Japanese models and fine-tuning code. LUKE is also integrated into the Hugging Face Transformers library, indicating strong community adoption and support.
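Because of that integration, fine-tuned LUKE checkpoints can be used directly from the Transformers library. A hedged sketch of span-based NER follows; the studio-ousia/luke-large-finetuned-conll-2003 checkpoint name is taken from the Hugging Face model hub, and the candidate spans are illustrative:

```python
# Sketch: classify candidate entity spans with a LUKE model fine-tuned
# for NER (CoNLL-2003). Checkpoint name assumed from the Hugging Face hub.
from transformers import LukeForEntitySpanClassification, LukeTokenizer

name = "studio-ousia/luke-large-finetuned-conll-2003"
tokenizer = LukeTokenizer.from_pretrained(name)
model = LukeForEntitySpanClassification.from_pretrained(name)

text = "Beyoncé lives in Los Angeles."
# Candidate character spans; a real pipeline enumerates all short spans.
entity_spans = [(0, 7), (17, 28)]

inputs = tokenizer(text, entity_spans=entity_spans, return_tensors="pt")
logits = model(**inputs).logits                 # (1, num_spans, num_labels)
for span, label_id in zip(entity_spans, logits.argmax(-1).squeeze(0).tolist()):
    print(text[span[0]:span[1]], "->", model.config.id2label[label_id])
```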
Licensing & Compatibility
The repository does not explicitly state a license. Its integration with Hugging Face Transformers suggests it is intended for broad use, including research and potentially commercial applications, but users should verify licensing details themselves.
Limitations & Caveats
The primary installation method uses Poetry, which may be unfamiliar to some users. PyTorch often must be installed manually to match specific hardware configurations. The README does not specify a license, which could be a concern for commercial use.