bert-for-tf2 by kpe

Keras layer for BERT, ALBERT, and adapter-BERT implementations

created 6 years ago
807 stars

Project Summary

This repository provides a TensorFlow 2.0 Keras implementation of BERT, ALBERT, and adapter-BERT that can load the original pre-trained weights and produce activations numerically identical to those of the reference implementation. It is aimed at NLP researchers and engineers who want a flexible, efficient way to integrate these transformer models into Keras workflows.

How It Works

The implementation is built from scratch using basic TensorFlow operations, mirroring google-research/bert/modeling.py with simplifications. It leverages kpe/params-flow to reduce Keras boilerplate. Support for ALBERT and adapter-BERT is achieved through configuration parameters like shared_layer and adapter_size, allowing for parameter-efficient fine-tuning by adding small adapter layers over frozen BERT weights.
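
The snippet below is a minimal sketch of that flow, assuming a locally downloaded Google BERT checkpoint directory; values such as adapter_size=64 and max_seq_len=128 are illustrative, not taken from the project's examples.

```python
import os

import tensorflow as tf
import bert

model_dir = ".models/uncased_L-12_H-768_A-12"   # assumed local checkpoint dir

# Read the hyperparameters from the checkpoint's bert_config.json.
bert_params = bert.params_from_pretrained_ckpt(model_dir)

# adapter_size adds small bottleneck adapters (adapter-BERT);
# shared_layer=True would share weights across layers (ALBERT-style).
bert_params.adapter_size = 64        # illustrative value
# bert_params.shared_layer = True

l_bert = bert.BertModelLayer.from_params(bert_params, name="bert")

# Wire the layer into a plain Keras model.
max_seq_len = 128
l_input_ids = tf.keras.layers.Input(shape=(max_seq_len,), dtype="int32")
seq_output = l_bert(l_input_ids)                     # [batch, seq_len, hidden]
cls_output = tf.keras.layers.Lambda(lambda x: x[:, 0, :])(seq_output)
logits = tf.keras.layers.Dense(2)(cls_output)

model = tf.keras.Model(inputs=l_input_ids, outputs=logits)
model.build(input_shape=(None, max_seq_len))

# Load the original pre-trained weights into the Keras layer, then mark the
# original BERT weights as non-trainable so only the adapters are fine-tuned.
bert.load_stock_weights(l_bert, os.path.join(model_dir, "bert_model.ckpt"))
l_bert.apply_adapter_freeze()
```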

Quick Start & Requirements

  • Install via pip: pip install bert-for-tf2
  • Requires TensorFlow 2.0 or newer.
  • Pre-trained weights can be fetched using the provided utility functions (see the sketch after this list).
  • See the examples directory for Colab notebooks.
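
As a rough illustration of the fetch utilities (function names as shown in the project README; model names and target directory are illustrative):

```python
import os

import bert

# Fetch an original Google BERT checkpoint into a local directory.
model_name = "uncased_L-12_H-768_A-12"
model_dir = bert.fetch_google_bert_model(model_name, ".models")
model_ckpt = os.path.join(model_dir, "bert_model.ckpt")

# Or fetch a TFHub ALBERT model instead.
albert_dir = bert.fetch_tfhub_albert_model("albert_base", ".models")

# The fetched directories can then be used with params_from_pretrained_ckpt()
# and load_stock_weights() / load_albert_weights(), as in the sketch above.
```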

Highlighted Details

  • Supports loading original BERT, ALBERT (TFHub and non-TFHub), and brightmart/albert_zh weights.
  • Implements adapter-BERT for parameter-efficient transfer learning.
  • Provides utilities for fetching models and for tokenization across the supported BERT/ALBERT variants (see the tokenization sketch after this list).
  • Includes detailed examples for fine-tuning sentiment classifiers.
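
As a small sketch of the tokenization utilities (the vocab path and sample sentence are illustrative, and model_dir is assumed to point at a fetched Google BERT checkpoint):

```python
import os

import bert

model_dir = ".models/uncased_L-12_H-768_A-12"   # assumed checkpoint directory
vocab_file = os.path.join(model_dir, "vocab.txt")

# WordPiece tokenizer bundled with the package (for original BERT vocabularies).
tokenizer = bert.bert_tokenization.FullTokenizer(vocab_file, do_lower_case=True)

tokens = tokenizer.tokenize("This movie was surprisingly good!")
token_ids = tokenizer.convert_tokens_to_ids(["[CLS]"] + tokens + ["[SEP]"])
print(tokens, token_ids)

# ALBERT checkpoints use a SentencePiece-based tokenizer instead,
# available via bert.albert_tokenization.
```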

Maintenance & Community

The repository is effectively inactive: the last commit was about two years ago, and no pull requests or issues have been handled in the last 30 days (see the Health Check below).

Licensing & Compatibility

  • MIT License.
  • Compatible with commercial use and closed-source linking.

Limitations & Caveats

The project's last significant update was in July 2020, so open issues may go unaddressed and support for newer transformer architectures or recent TensorFlow features should not be expected.

Health Check

  • Last commit: 2 years ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 0 stars in the last 90 days
