Adapter-tuning code for BERT research paper
This repository provides a version of BERT enhanced with adapters, a parameter-efficient transfer learning technique for Natural Language Processing. It enables training models for new tasks by modifying only a small subset of parameters per task, achieving performance comparable to full fine-tuning while maintaining compact, shared models. The primary audience is NLP researchers and practitioners seeking efficient model adaptation.
How It Works
The core innovation is the integration of "adapters" into the BERT architecture. These adapters are small, task-specific modules inserted between the layers of the pre-trained BERT model. During training for a new task, only the parameters within these adapter modules are updated, while the vast majority of BERT's parameters remain frozen. This approach significantly reduces the number of trainable parameters, leading to faster training, smaller model checkpoints per task, and reduced catastrophic forgetting.
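As a rough illustration of the idea (a sketch, not the repository's own TensorFlow 1.x code: the class name, the bottleneck size of 64, and the use of Keras layers are assumptions), an adapter is a small bottleneck network with a residual connection, initialized close to the identity so that inserting it initially leaves the pre-trained network's behaviour unchanged:

import tensorflow as tf


class Adapter(tf.keras.layers.Layer):
    """Bottleneck adapter: project down, apply a nonlinearity, project back up,
    then add a residual connection. Only these weights are task-specific."""

    def __init__(self, hidden_size, bottleneck_size=64, **kwargs):
        super().__init__(**kwargs)
        # Near-zero initialization keeps the pre-trained behaviour at the start of training.
        init = tf.keras.initializers.TruncatedNormal(stddev=1e-3)
        # GELU as in the adapter paper; tf.nn.gelu requires a recent TF 2.x.
        self.down_project = tf.keras.layers.Dense(
            bottleneck_size, activation=tf.nn.gelu, kernel_initializer=init)
        self.up_project = tf.keras.layers.Dense(hidden_size, kernel_initializer=init)

    def call(self, hidden_states):
        # Residual connection: with near-zero weights the adapter starts as an identity map.
        return hidden_states + self.up_project(self.down_project(hidden_states))

The bottleneck dimension controls the trade-off between per-task parameter count and task accuracy.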
Quick Start & Requirements
Like the upstream BERT code it is forked from, the repository is a TensorFlow (1.x-era) implementation; you will need a Python environment with TensorFlow, a downloaded pre-trained BERT checkpoint, and the GLUE data. The following command fine-tunes BERT with adapters on MRPC:
export BERT_BASE_DIR=/path/to/bert/uncased_L-12_H-768_A-12
export GLUE_DIR=/path/to/glue
python run_classifier.py \
--task_name=MRPC \
--do_train=true \
--do_eval=true \
--data_dir=$GLUE_DIR/MRPC \
--vocab_file=$BERT_BASE_DIR/vocab.txt \
--bert_config_file=$BERT_BASE_DIR/bert_config.json \
--init_checkpoint=$BERT_BASE_DIR/bert_model.ckpt \
--max_seq_length=128 \
--train_batch_size=32 \
--learning_rate=3e-4 \
--num_train_epochs=5.0 \
--output_dir=/tmp/adapter_bert_mrpc/
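The command mirrors the upstream BERT fine-tuning interface; what changes is which variables the optimizer updates. Below is a hypothetical TF2/Keras-style sketch of that selective update (the helper name, the variable-name patterns, and the training loop are assumptions, not the fork's actual TF1 code); per the paper, only the adapters, layer-norm parameters, and the task head are trained:

import tensorflow as tf


def select_tunable_variables(model):
    # Keep only adapter, layer-norm, and classifier-head variables; the name
    # patterns are hypothetical and depend on how the model was built.
    patterns = ("adapter", "layer_norm", "classifier")
    return [v for v in model.variables if any(p in v.name for p in patterns)]


optimizer = tf.keras.optimizers.Adam(learning_rate=3e-4)


@tf.function
def train_step(model, loss_fn, features, labels):
    tunable = select_tunable_variables(model)
    with tf.GradientTape() as tape:
        logits = model(features, training=True)
        loss = loss_fn(labels, logits)
    # Gradients are computed and applied only for the small tunable subset,
    # so the shared pre-trained BERT weights remain frozen.
    grads = tape.gradient(loss, tunable)
    optimizer.apply_gradients(zip(grads, tunable))
    return loss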
Highlighted Details
Only the small adapter modules (plus layer norms and the task head) are trained per task, so per-task checkpoints stay compact and the frozen BERT weights can be shared across tasks. The example above also uses a higher learning rate (3e-4) and more epochs (5) than typical full fine-tuning, which adapter tuning tolerates because so few parameters are updated.
Maintenance & Community
The last recorded activity was about a year ago and the project is marked inactive.
Licensing & Compatibility
The README does not specify a license for this particular fork. Users should verify licensing compatibility before commercial or closed-source use.
Limitations & Caveats
The example provided covers only the MRPC task; support for additional tasks may require referencing the original BERT repository.