adapter-bert by google-research

Adapter-tuning code for the BERT research paper

created 6 years ago
498 stars

Top 63.2% on sourcepulse

Project Summary

This repository provides a version of BERT enhanced with adapters, a parameter-efficient transfer learning technique for Natural Language Processing. It enables training models for new tasks by updating only a small, task-specific subset of parameters, achieving performance comparable to full fine-tuning while keeping per-task checkpoints compact and sharing the frozen base model across tasks. The primary audience is NLP researchers and practitioners seeking efficient model adaptation.

How It Works

The core innovation is the integration of "adapters" into the BERT architecture. Adapters are small, task-specific modules inserted within each transformer layer of the pre-trained model. When training on a new task, only the adapter parameters (along with layer normalization and the task-specific head) are updated, while the vast majority of BERT's weights remain frozen. This sharply reduces the number of trainable parameters, yielding faster training, small per-task checkpoints, and no catastrophic forgetting of the shared weights, since those are never modified.
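
For intuition, here is a minimal numpy sketch of the bottleneck adapter computation from the paper (Houlsby et al., 2019): a down-projection, a nonlinearity, an up-projection, and a residual connection. The bottleneck size of 64 and the initialization scale are illustrative assumptions, not the repository's exact defaults.

    import numpy as np

    def gelu(x):
        # tanh approximation of the GELU activation used in BERT
        return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x ** 3)))

    def adapter(hidden, w_down, b_down, w_up, b_up):
        # Bottleneck adapter: down-project, nonlinearity, up-project, residual.
        z = gelu(hidden @ w_down + b_down)    # (batch, bottleneck)
        return hidden + (z @ w_up + b_up)     # residual keeps the frozen BERT path intact

    d, m = 768, 64                            # BERT-base hidden size; hypothetical bottleneck
    rng = np.random.default_rng(0)
    hidden = rng.standard_normal((4, d))
    # Near-zero initialization makes the adapter start close to the identity,
    # so training begins from the unmodified pre-trained network.
    w_down, b_down = rng.standard_normal((d, m)) * 1e-3, np.zeros(m)
    w_up, b_up = rng.standard_normal((m, d)) * 1e-3, np.zeros(d)

    print(adapter(hidden, w_down, b_down, w_up, b_up).shape)  # (4, 768)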

Quick Start & Requirements

  • Install/Run: The code is a fork of the original BERT repository; follow that repo's setup instructions.
  • Prerequisites: A GPU with at least 12 GB of memory (or a Cloud TPU), a pre-trained BERT checkpoint, and the GLUE data.
  • Details: See https://github.com/google-research/bert for checkpoint and GLUE task instructions; a download sketch follows the example command below.
  • Example Command:
    export BERT_BASE_DIR=/path/to/bert/uncased_L-12_H-768_A-12
    export GLUE_DIR=/path/to/glue
    python run_classifier.py \
      --task_name=MRPC \
      --do_train=true \
      --do_eval=true \
      --data_dir=$GLUE_DIR/MRPC \
      --vocab_file=$BERT_BASE_DIR/vocab.txt \
      --bert_config_file=$BERT_BASE_DIR/bert_config.json \
      --init_checkpoint=$BERT_BASE_DIR/bert_model.ckpt \
      --max_seq_length=128 \
      --train_batch_size=32 \
      --learning_rate=3e-4 \
      --num_train_epochs=5.0 \
      --output_dir=/tmp/adapter_bert_mrpc/
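
Note that the learning rate (3e-4) is roughly an order of magnitude higher than the 2e-5 to 5e-5 range commonly used for full BERT fine-tuning; with only the small adapter modules being trained, larger steps are typically stable. If the checkpoint is not yet on disk, here is a minimal Python sketch for fetching and unpacking the BERT-base (uncased) archive; the URL is the one published in the original BERT README, and the destination path is a placeholder to adjust.

    import os
    import urllib.request
    import zipfile

    URL = ("https://storage.googleapis.com/bert_models/2018_10_18/"
           "uncased_L-12_H-768_A-12.zip")
    DEST = "/path/to/bert"  # placeholder; point BERT_BASE_DIR at the unpacked folder

    os.makedirs(DEST, exist_ok=True)
    archive = os.path.join(DEST, "uncased_L-12_H-768_A-12.zip")
    urllib.request.urlretrieve(URL, archive)  # downloads a few hundred MB
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(DEST)                   # yields uncased_L-12_H-768_A-12/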
    

Highlighted Details

  • Achieves performance similar to full fine-tuning on GLUE tasks.
  • Significantly reduces the number of trainable parameters per task (see the estimate after this list).
  • Enables compact models that share parameters across multiple tasks.
  • Code is a fork of the original Google BERT repository.
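
To make the parameter reduction concrete, here is a back-of-the-envelope estimate under illustrative assumptions: BERT-base sizes (hidden size 768, 12 layers, ~110M parameters), two adapters per transformer layer as in the paper, and a hypothetical bottleneck of 64. The repository's defaults may differ.

    # Rough trainable-parameter count for adapter-tuning vs. full fine-tuning.
    d, layers, adapters_per_layer, m = 768, 12, 2, 64  # illustrative assumptions

    per_adapter = (d * m + m) + (m * d + d)  # down- and up-projection, with biases
    adapter_params = per_adapter * adapters_per_layer * layers

    bert_base_params = 110_000_000           # ~110M parameters in BERT-base
    print(f"adapter params per task: {adapter_params:,}")                      # ~2.4M
    print(f"fraction of full model: {adapter_params / bert_base_params:.1%}")  # ~2.2%

Layer-normalization parameters and the task head are also trained in the paper's setup, so real totals are slightly higher, but the order of magnitude holds.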

Licensing & Compatibility

  • License: Not explicitly stated in the README. The original BERT repo is Apache 2.0.
  • Compatibility: As a fork of the Apache-2.0 BERT repository, the same terms presumably apply, but this is unverified.

Limitations & Caveats

The README does not specify the license for this particular fork. Users should verify licensing compatibility for commercial or closed-source applications. The example provided is for the MRPC task, and support for additional tasks may require referencing the original BERT repository.

Health Check

  • Last commit: 1 year ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0

Star History

11 stars in the last 90 days
