pytorchic-bert by dhlee347

PyTorch re-implementation of Google BERT

created 6 years ago
593 stars

Top 55.7% on sourcepulse

Project Summary

This repository provides a PyTorch implementation of Google's BERT model, aiming for a more "pythonic" and "pytorchic" code style with fewer lines of code than existing implementations. It is suitable for researchers and practitioners familiar with PyTorch who want to fine-tune or pre-train BERT models for NLP tasks.

How It Works

The project implements BERT using core PyTorch modules, inspired by Hugging Face's codebase but refactored for clarity and conciseness. It includes modules for tokenization (adapted from the original BERT release), loading TensorFlow checkpoints, the transformer model architecture, a custom BertAdam optimizer, and utilities for training and evaluation. This structure supports both pre-training on custom corpora and fine-tuning on downstream tasks such as the GLUE benchmarks.
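The custom BertAdam optimizer pairs Adam with the linear warmup-then-decay learning-rate schedule used in BERT training. A minimal stdlib-only sketch of that schedule (the warmup fraction and the exact decay shape used by this repo are assumptions):

```python
# Sketch of a linear warmup-then-decay learning-rate schedule, as
# commonly paired with BertAdam. The warmup fraction (0.1) and the
# linear decay to zero are illustrative defaults, not this repo's
# confirmed settings.

def warmup_linear(progress: float, warmup: float = 0.1) -> float:
    """Scale factor for the base learning rate.

    progress: fraction of total training steps completed, in [0, 1].
    warmup:   fraction of steps spent linearly ramping up from 0 to 1.
    """
    if progress < warmup:
        return progress / warmup                        # linear ramp: 0 -> 1
    return max((1.0 - progress) / (1.0 - warmup), 0.0)  # linear decay: 1 -> 0

# Example: effective learning rates over a 1000-step run.
base_lr = 5e-5
lrs = [base_lr * warmup_linear(step / 1000) for step in range(0, 1001, 100)]
```

The scale factor peaks at exactly 1.0 when warmup ends, so the base learning rate is reached once and then decays linearly to zero at the final step.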

Quick Start & Requirements

  • Install: pip install fire tqdm tensorboardx
  • Prerequisites: Python >= 3.6, TensorFlow (for checkpoint loading), CUDA (for GPU acceleration).
  • Usage: Requires downloading pre-trained BERT checkpoints and GLUE benchmark datasets. Example commands are provided for fine-tuning (MRPC task) and pre-training.
  • Links: No explicit links to official docs or demos are provided in the README.
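Putting the steps above together, a setup sketch might look like the following. The pip command is from the README; the checkpoint URL is Google's public BERT-Base (uncased) release, and the target directory name is illustrative:

```shell
# Install the repo's Python dependencies (from the README).
pip install fire tqdm tensorboardx

# Download and unpack a pre-trained TensorFlow checkpoint
# (BERT-Base, uncased -- Google's public release).
wget https://storage.googleapis.com/bert_models/2018_10_18/uncased_L-12_H-768_A-12.zip
unzip uncased_L-12_H-768_A-12.zip -d bert_base_uncased  # illustrative target dir
```

The GLUE datasets and the repo's exact fine-tuning/pre-training commands are not reproduced here; consult the README for those invocations.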

Highlighted Details

  • Achieves 84.3% accuracy on the MRPC task, comparable to Google's reported 84.5%.
  • Includes scripts for both fine-tuning and pre-training BERT.
  • Supports loading checkpoints from TensorFlow.
  • Demonstrates training curves for Masked LM and Next Sentence Prediction losses.
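The Masked LM loss tracked in those training curves comes from BERT's standard input corruption: 15% of tokens are selected for prediction, and of those, 80% are replaced with [MASK], 10% with a random token, and 10% kept unchanged. A stdlib-only sketch of that scheme (function name and vocabulary are hypothetical, not this repo's API):

```python
import random

# Sketch of BERT's masked-LM input corruption (15% / 80-10-10 split).
# `mask_tokens` and the toy vocabulary are illustrative, not names
# from this repository.

def mask_tokens(tokens, vocab, mask_prob=0.15, rng=None):
    rng = rng or random.Random(0)
    out, labels = list(tokens), [None] * len(tokens)
    for i, tok in enumerate(tokens):
        if rng.random() >= mask_prob:
            continue                        # not selected: no prediction target
        labels[i] = tok                     # model must predict the original
        r = rng.random()
        if r < 0.8:
            out[i] = "[MASK]"               # 80%: replace with [MASK]
        elif r < 0.9:
            out[i] = rng.choice(vocab)      # 10%: replace with a random token
        # remaining 10%: keep the original token unchanged
    return out, labels

corrupted, targets = mask_tokens(
    ["the", "quick", "brown", "fox", "jumps"], vocab=["apple", "pear"],
    rng=random.Random(42))
```

Keeping 10% of selected tokens unchanged forces the model to produce useful representations even for tokens it can see, since it cannot tell which positions carry a prediction target.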

Maintenance & Community

The repository is maintained by dhlee347. No information on community channels, roadmap, or notable contributors is present in the README.

Licensing & Compatibility

The README does not explicitly state a license. Without one, default copyright applies: reuse and redistribution rights are legally unclear, so users (commercial users in particular) should seek clarification from the author before depending on the code.

Limitations & Caveats

The README states the code is "not so heavily tested" and encourages users to report bugs. It relies on TensorFlow for initial checkpoint loading, which might be an inconvenience for pure PyTorch workflows. The absence of explicit licensing could pose compatibility issues for commercial applications.

Health Check

  • Last commit: 5 years ago
  • Responsiveness: Inactive
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 1 star in the last 90 days
