bert4keras by bojone

Keras library for Transformer models, aiming for clarity

Created 6 years ago
5,418 stars

Top 9.3% on SourcePulse

Project Summary

This library provides a clean and lightweight Keras implementation of Transformer models, primarily BERT, designed for ease of modification and customization. It targets researchers and developers needing to fine-tune, pre-train, or experiment with Transformer architectures within the Keras/TensorFlow ecosystem. The key benefit is a simplified codebase that supports loading various pre-trained weights and offers extensive examples for common NLP tasks.

How It Works

The project reimplements Transformer models in Keras, focusing on a clear and modular structure. It supports loading pre-trained weights from popular models like BERT, RoBERTa, ALBERT, and T5, facilitating transfer learning. The library handles essential components like attention masks and provides utilities for pre-training from scratch, including multi-GPU and TPU support. This approach aims to reduce dependencies and improve maintainability compared to more heavily encapsulated libraries.
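As a minimal sketch of that workflow (the file paths are placeholders; build_transformer_model and Tokenizer are the library's public entry points), loading a pre-trained BERT checkpoint and extracting features looks roughly like this:

    import numpy as np
    from bert4keras.models import build_transformer_model
    from bert4keras.tokenizers import Tokenizer

    # Placeholder paths to a downloaded Google-format BERT checkpoint
    config_path = 'bert_config.json'
    checkpoint_path = 'bert_model.ckpt'
    dict_path = 'vocab.txt'

    tokenizer = Tokenizer(dict_path, do_lower_case=True)
    model = build_transformer_model(config_path, checkpoint_path)  # model='bert' by default

    token_ids, segment_ids = tokenizer.encode('language models')
    features = model.predict([np.array([token_ids]), np.array([segment_ids])])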

Quick Start & Requirements

  • Install stable version: pip install bert4keras
  • Install latest version: pip install git+https://www.github.com/bojone/bert4keras.git
  • Recommended environment: TensorFlow 1.14 + Keras 2.3.1.
  • TensorFlow 2.x is supported, requiring the TF_KERAS=1 environment variable (see the sketch after this list).
  • Keras 2.4+ is compatible, but it is essentially equivalent to tf.keras, so using it amounts to running under tf.keras.
  • See examples directory for usage.
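A minimal sketch of the TensorFlow 2.x setup mentioned above; the key point from the README is that TF_KERAS=1 must be in the environment before bert4keras is imported:

    import os
    os.environ['TF_KERAS'] = '1'  # must be set before importing bert4keras under TF 2.x

    from bert4keras.models import build_transformer_model  # now backed by tf.keras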

Highlighted Details

  • Supports loading pre-trained weights for BERT, RoBERTa, ALBERT, T5, ELECTRA, GPT2, and more.
  • Includes code for pre-training models from scratch, with TPU and multi-GPU support.
  • Offers utilities for sequence-to-sequence tasks, including auto-title generation (see the sketch after this list).
  • Features like hierarchical position embeddings for handling long texts and residual attention scores are available.
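For the sequence-to-sequence tasks noted above, the library exposes a UniLM-style application mode that reuses BERT weights with a seq2seq attention mask; a hedged sketch (paths are placeholders):

    from bert4keras.models import build_transformer_model

    # application='unilm' applies a UniLM-style attention mask so the encoder
    # can be trained as a seq2seq model; the examples directory pairs this
    # with autoregressive decoding for tasks such as auto-title generation.
    model = build_transformer_model(
        'bert_config.json',   # placeholder config path
        'bert_model.ckpt',    # placeholder checkpoint path
        application='unilm',
    )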

Maintenance & Community

The project is actively maintained by Jianlin Su, with contributions welcomed. The author's blog is at https://kexue.fm/, and online documentation is available at http://bert4keras.spaces.ac.cn/. A PyTorch-based alternative, bert4torch, is also mentioned.

Licensing & Compatibility

The README does not state a license, so licensing is effectively unspecified; clarify terms with the author before commercial use or closed-source integration.

Limitations & Caveats

Versions 0.2.4 and later only support Google's version of ALBERT weights or brightmart's ALBERT weights that explicitly mention "Google". Older brightmart ALBERT weights require version 0.2.3 or the author's converted albert_zh. Support for Keras versions prior to 2.3.0 was dropped in late 2019.
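As a sketch of the ALBERT caveat (paths are placeholders), Google-version ALBERT weights are loaded by selecting the model type explicitly:

    from bert4keras.models import build_transformer_model

    # On bert4keras >= 0.2.4 this works with Google's ALBERT weights (or
    # brightmart weights explicitly marked "Google"); older brightmart ALBERT
    # weights need version 0.2.3 or the author's converted albert_zh.
    model = build_transformer_model(
        'albert_config.json',  # placeholder config path
        'albert_model.ckpt',   # placeholder checkpoint path
        model='albert',
    )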

Health Check

  • Last Commit: 10 months ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 2 stars in the last 30 days

Explore Similar Projects

Starred by Elie Bursztein (Cybersecurity Lead at Google DeepMind), Omar Khattab (coauthor of DSPy and ColBERT; professor at MIT), and 15 more.

gpt-neo by EleutherAI

GPT-2/3-style model implementation using mesh-tensorflow

Created 5 years ago, updated 3 years ago
8k stars

Top 0.0% on SourcePulse