adapter-bert by google-research

Adapter-tuning code for the BERT research paper

created 6 years ago
498 stars

Top 63.2% on sourcepulse

Project Summary

This repository provides a version of BERT enhanced with adapters, a parameter-efficient transfer learning technique for Natural Language Processing. It enables training models for new tasks by updating only a small, task-specific subset of parameters, achieving performance comparable to full fine-tuning while keeping per-task checkpoints compact and sharing the frozen base model across tasks. The primary audience is NLP researchers and practitioners seeking efficient model adaptation.

How It Works

The core innovation is the integration of "adapters" into the BERT architecture. Adapters are small, task-specific modules inserted within each transformer layer of the pre-trained model. When training on a new task, only the adapter parameters (along with layer normalization and the task-specific head) are updated, while the vast majority of BERT's weights remain frozen. This sharply reduces the number of trainable parameters, yielding faster training, small per-task checkpoints, and no catastrophic forgetting of the shared weights, since those are never modified.
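
For intuition, here is a minimal numpy sketch of the bottleneck adapter computation from the paper (Houlsby et al., 2019): a down-projection, a nonlinearity, an up-projection, and a residual connection. The bottleneck size of 64 and the initialization scale are illustrative assumptions, not the repository's exact defaults.

    import numpy as np

    def gelu(x):
        # tanh approximation of the GELU activation used in BERT
        return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x ** 3)))

    def adapter(hidden, w_down, b_down, w_up, b_up):
        # Bottleneck adapter: down-project, nonlinearity, up-project, residual.
        z = gelu(hidden @ w_down + b_down)    # (batch, bottleneck)
        return hidden + (z @ w_up + b_up)     # residual keeps the frozen BERT path intact

    d, m = 768, 64                            # BERT-base hidden size; hypothetical bottleneck
    rng = np.random.default_rng(0)
    hidden = rng.standard_normal((4, d))
    # Near-zero initialization makes the adapter start close to the identity,
    # so training begins from the unmodified pre-trained network.
    w_down, b_down = rng.standard_normal((d, m)) * 1e-3, np.zeros(m)
    w_up, b_up = rng.standard_normal((m, d)) * 1e-3, np.zeros(d)

    print(adapter(hidden, w_down, b_down, w_up, b_up).shape)  # (4, 768)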

Quick Start & Requirements

  • Install/Run: The code is a fork of the original BERT repository; follow that repo's setup instructions.
  • Prerequisites: A GPU with at least 12 GB of memory (or a Cloud TPU), a pre-trained BERT checkpoint, and the GLUE data.
  • Details: See https://github.com/google-research/bert for checkpoint and GLUE task instructions; a download sketch follows the example command below.
  • Example Command:
    export BERT_BASE_DIR=/path/to/bert/uncased_L-12_H-768_A-12
    export GLUE_DIR=/path/to/glue
    python run_classifier.py \
      --task_name=MRPC \
      --do_train=true \
      --do_eval=true \
      --data_dir=$GLUE_DIR/MRPC \
      --vocab_file=$BERT_BASE_DIR/vocab.txt \
      --bert_config_file=$BERT_BASE_DIR/bert_config.json \
      --init_checkpoint=$BERT_BASE_DIR/bert_model.ckpt \
      --max_seq_length=128 \
      --train_batch_size=32 \
      --learning_rate=3e-4 \
      --num_train_epochs=5.0 \
      --output_dir=/tmp/adapter_bert_mrpc/
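
Note that the learning rate (3e-4) is roughly an order of magnitude higher than the 2e-5 to 5e-5 range commonly used for full BERT fine-tuning; with only the small adapter modules being trained, larger steps are typically stable. If the checkpoint is not yet on disk, here is a minimal Python sketch for fetching and unpacking the BERT-base (uncased) archive; the URL is the one published in the original BERT README, and the destination path is a placeholder to adjust.

    import os
    import urllib.request
    import zipfile

    URL = ("https://storage.googleapis.com/bert_models/2018_10_18/"
           "uncased_L-12_H-768_A-12.zip")
    DEST = "/path/to/bert"  # placeholder; point BERT_BASE_DIR at the unpacked folder

    os.makedirs(DEST, exist_ok=True)
    archive = os.path.join(DEST, "uncased_L-12_H-768_A-12.zip")
    urllib.request.urlretrieve(URL, archive)  # downloads a few hundred MB
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(DEST)                   # yields uncased_L-12_H-768_A-12/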
    

Highlighted Details

  • Achieves performance similar to full fine-tuning on GLUE tasks.
  • Significantly reduces the number of trainable parameters per task (see the estimate after this list).
  • Enables compact models that share parameters across multiple tasks.
  • Code is a fork of the original Google BERT repository.
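
To make the parameter reduction concrete, here is a back-of-the-envelope estimate under illustrative assumptions: BERT-base sizes (hidden size 768, 12 layers, ~110M parameters), two adapters per transformer layer as in the paper, and a hypothetical bottleneck of 64. The repository's defaults may differ.

    # Rough trainable-parameter count for adapter-tuning vs. full fine-tuning.
    d, layers, adapters_per_layer, m = 768, 12, 2, 64  # illustrative assumptions

    per_adapter = (d * m + m) + (m * d + d)  # down- and up-projection, with biases
    adapter_params = per_adapter * adapters_per_layer * layers

    bert_base_params = 110_000_000           # ~110M parameters in BERT-base
    print(f"adapter params per task: {adapter_params:,}")                      # ~2.4M
    print(f"fraction of full model: {adapter_params / bert_base_params:.1%}")  # ~2.2%

Layer-normalization parameters and the task head are also trained in the paper's setup, so real totals are slightly higher, but the order of magnitude holds.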

Licensing & Compatibility

  • License: Not explicitly stated in the README. The original BERT repo is Apache 2.0.
  • Compatibility: As a fork of the Apache-2.0 BERT repository, the same terms presumably apply, but this is unverified.

Limitations & Caveats

The README does not specify the license for this particular fork. Users should verify licensing compatibility for commercial or closed-source applications. The example provided is for the MRPC task, and support for additional tasks may require referencing the original BERT repository.

Health Check

  • Last commit: 1 year ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0

Star History

11 stars in the last 90 days
