bert-nmt by bert-nmt

Training script for BERT-fused Neural Machine Translation (NMT)

created 5 years ago
362 stars

Top 78.7% on sourcepulse

Project Summary

This repository provides code for BERT-fused Neural Machine Translation (NMT), enhancing translation quality by integrating BERT embeddings. It's targeted at researchers and practitioners in NLP and machine translation looking to leverage large pre-trained language models for improved NMT performance.

How It Works

The approach fuses BERT into a standard Transformer NMT architecture. The source sentence is first encoded by BERT, and the NMT encoder and decoder layers then attend to BERT's contextual representations through an additional attention module alongside their usual attention, allowing the NMT model to benefit from BERT's deep linguistic knowledge. This fusion aims to capture richer semantic information than a conventional NMT model, leading to more accurate translations.
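
A minimal sketch of the fusion, assuming the attention-based variant described in the paper; class and parameter names below are illustrative, not the repository's actual fairseq modules. Each encoder layer attends both to its own states (self-attention) and to the frozen BERT output (BERT-attention), and the two results are averaged before the feed-forward sub-layer.

    import torch.nn as nn

    class BertFusedEncoderLayer(nn.Module):
        """Illustrative Transformer encoder layer with an extra BERT-attention path."""

        def __init__(self, d_model=512, d_bert=768, nhead=8, dim_ff=2048, dropout=0.1):
            super().__init__()
            self.self_attn = nn.MultiheadAttention(d_model, nhead, dropout=dropout)
            # kdim/vdim let the layer attend to BERT states of a different width.
            self.bert_attn = nn.MultiheadAttention(d_model, nhead, dropout=dropout,
                                                   kdim=d_bert, vdim=d_bert)
            self.ffn = nn.Sequential(nn.Linear(d_model, dim_ff), nn.ReLU(),
                                     nn.Linear(dim_ff, d_model))
            self.norm1 = nn.LayerNorm(d_model)
            self.norm2 = nn.LayerNorm(d_model)
            self.drop = nn.Dropout(dropout)

        def forward(self, x, bert_out):
            # x: (src_len, batch, d_model); bert_out: (bert_len, batch, d_bert), BERT stays frozen.
            h_self, _ = self.self_attn(x, x, x)
            h_bert, _ = self.bert_attn(x, bert_out, bert_out)
            x = self.norm1(x + self.drop(0.5 * (h_self + h_bert)))  # average the two attention paths
            x = self.norm2(x + self.drop(self.ffn(x)))
            return x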

Quick Start & Requirements

  • Installation: pip install --editable . after cloning the repository.
  • Prerequisites: PyTorch version 1.0.0/1.1.0, Python version >= 3.5. Requires Fairseq for data preprocessing and baseline NMT training.
  • Data: Requires tokenized and BPE-encoded data files, prepared using Fairseq's prepare-xxx.sh and a custom makedataforbert.sh script (the BERT-side step is sketched after this list).
  • Resources: Training involves standard NMT resource requirements, potentially higher due to BERT integration.
  • Links: Fairseq for baseline NMT.
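
As a rough illustration of the BERT-side preprocessing mentioned in the Data item (an assumption about what makedataforbert.sh produces, with hypothetical file names), each source sentence is re-tokenized with the matching Hugging Face BERT tokenizer into a parallel file that the BERT encoder reads at training time:

    from transformers import BertTokenizer

    # Hypothetical file names; the repository's makedataforbert.sh works on the fairseq-prepared splits.
    tokenizer = BertTokenizer.from_pretrained("bert-base-german-dbmdz-uncased")

    with open("train.de") as src, open("train.bert.de", "w") as out:
        for line in src:
            # Re-tokenize the raw source sentence with BERT's WordPiece vocabulary.
            pieces = tokenizer.tokenize(line.strip())
            out.write(" ".join(pieces) + "\n")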

Highlighted Details

  • Achieved 37.34 BLEU on IWSLT'14 de->en using bert-base-german-dbmdz-uncased.
  • Supports fine-tuning with a pre-trained vanilla NMT model (--warmup-from-nmt).
  • Implements an encoder dropout technique (--encoder-bert-dropout) for regularization; one reading of it is sketched after this list.
  • Compatible with Hugging Face's transformers library for various BERT models.
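
The --encoder-bert-dropout flag corresponds to the paper's drop-net style regularization. A plausible reading is sketched below (an assumption, not the repository's exact implementation): during training each layer randomly keeps only the self-attention output, only the BERT-attention output, or the average of both; at inference it always averages.

    import random

    def drop_net_mix(h_self, h_bert, p=0.5, training=True):
        # Drop-net style mixing of the two attention outputs (illustrative, not the repo's code).
        # With probability p/2 keep only one path, with p/2 only the other, otherwise average.
        if not training or p == 0.0:
            return 0.5 * (h_self + h_bert)      # inference: plain average
        u = random.random()
        if u < p / 2:
            return h_self                       # drop the BERT-attention path this step
        if u < p:
            return h_bert                       # drop the self-attention path this step
        return 0.5 * (h_self + h_bert)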

Maintenance & Community

The project is associated with the ICLR 2020 paper "Incorporating BERT into Neural Machine Translation." No specific community channels or active maintenance signals are evident from the README.

Licensing & Compatibility

The README does not explicitly state a license. Compatibility for commercial use or closed-source linking is not specified.

Limitations & Caveats

The code requires specific older versions of PyTorch (1.0.0/1.1.0), which may pose compatibility challenges with current environments. The data preparation steps involve custom scripts beyond standard Fairseq.

Health Check

  • Last commit: 2 years ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 0 stars in the last 90 days
