TinyBert by Lisennlp

TinyBERT: Distilled pre-trained language model based on BERT

Created 5 years ago
267 stars

Top 95.9% on SourcePulse

View on GitHub
Project Summary

This repository provides a simplified implementation of TinyBERT, a knowledge distillation framework for pre-trained language models. It aims to make the distillation process more accessible, letting users train their own distilled models on custom datasets.

How It Works

TinyBERT employs a multi-stage knowledge distillation process. It first distills a general-purpose student BERT from a teacher BERT. It then fine-tunes the teacher on task-specific data and distills this fine-tuned teacher into a task-specific student. Distillation minimizes losses over word embeddings, hidden states, and attention matrices, with a final stage that matches the teacher's task predictions. Data augmentation is also incorporated to improve performance.
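For intuition, here is a minimal PyTorch sketch of the layer-wise losses this style of distillation uses. The exact loss terms, layer mapping, and weighting in this repo may differ; `proj` and `layer_map` are illustrative names, not the repo's API.

```python
import torch.nn.functional as F

def intermediate_distill_loss(student_out, teacher_out, proj, layer_map):
    """Stage-one losses: match embeddings, hidden states, and attentions.

    student_out / teacher_out: Hugging Face-style outputs exposing
    .hidden_states and .attentions; proj: an nn.Linear mapping the student
    hidden size to the teacher hidden size; layer_map: the teacher layer
    index assigned to each student layer (illustrative).
    """
    # Embedding-layer loss (hidden_states[0] is the embedding output).
    loss = F.mse_loss(proj(student_out.hidden_states[0]),
                      teacher_out.hidden_states[0])
    for s, t in enumerate(layer_map):
        # Hidden-state loss for each mapped layer pair.
        loss = loss + F.mse_loss(proj(student_out.hidden_states[s + 1]),
                                 teacher_out.hidden_states[t + 1])
        # Attention loss (assumes equal head counts; the TinyBERT paper
        # uses pre-softmax scores, this sketch uses softmaxed attentions).
        loss = loss + F.mse_loss(student_out.attentions[s],
                                 teacher_out.attentions[t])
    return loss

def prediction_distill_loss(student_logits, teacher_logits, T=1.0):
    # Stage-two loss: soft cross-entropy over the teacher's task logits,
    # with temperature T.
    return F.kl_div(F.log_softmax(student_logits / T, dim=-1),
                    F.softmax(teacher_logits / T, dim=-1),
                    reduction="batchmean") * (T * T)
```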

Quick Start & Requirements

  • General Distillation: sh script/general_train.sh
  • Task Distillation (Stage 1): sh script/task_train.sh one
  • Task Distillation (Stage 2): sh script/task_train.sh two
  • Prerequisites: PyTorch, Hugging Face Transformers, GloVe embeddings (for data augmentation). Requires access to pre-trained BERT models (see the loading sketch after this list).
  • Data Format: data/train.txt, data/eval.txt.
  • Data Augmentation: python data_augmentation.py with parameters for pre-trained BERT model path, data path, GloVe path, and augmentation settings.
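As a rough sketch of what "access to pre-trained BERT models" means in practice, a teacher can be loaded and a smaller student instantiated with Hugging Face Transformers. The checkpoint name below is an assumption, and the 4-layer/312-hidden student config follows the TinyBERT paper, not necessarily this repo's scripts.

```python
from transformers import BertConfig, BertModel

# Teacher: any pre-trained BERT checkpoint (name assumed here); expose
# hidden states and attentions for the intermediate distillation losses.
teacher = BertModel.from_pretrained("bert-base-uncased",
                                    output_hidden_states=True,
                                    output_attentions=True)

# Student: a smaller, randomly initialized BERT. The 4-layer, 312-hidden,
# 1200-intermediate config matches TinyBERT-4 from the paper; the repo's
# general_train.sh may use different sizes.
student_cfg = BertConfig(num_hidden_layers=4, hidden_size=312,
                         num_attention_heads=12, intermediate_size=1200,
                         output_hidden_states=True, output_attentions=True)
student = BertModel(student_cfg)
```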

Highlighted Details

  • Simplified data loading for custom datasets.
  • Multi-stage distillation process for general and task-specific models.
  • Data augmentation strategy using BERT masking and GloVe similarity (sketched after this list).
  • Provides pre-distilled models for GLUE tasks.
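To make the augmentation bullet concrete, here is a minimal sketch of the usual single-piece-vs-multi-piece rule: words that map to one word piece are replaced via BERT's masked-token predictions, others via GloVe nearest neighbors. Function names and the `glove` dict interface are illustrative; the repo's data_augmentation.py will differ in detail.

```python
import numpy as np
import torch
from transformers import BertForMaskedLM, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
mlm = BertForMaskedLM.from_pretrained("bert-base-uncased").eval()

def candidates_for(word, sentence, glove, k=5):
    """Propose k replacement candidates for `word` in `sentence`.

    glove: dict mapping word -> numpy vector (illustrative interface).
    """
    pieces = tokenizer.tokenize(word)
    if len(pieces) == 1:
        # Single-piece word: mask it and let BERT propose replacements.
        # (Naive substring replacement, kept simple for the sketch.)
        masked = sentence.replace(word, tokenizer.mask_token, 1)
        inputs = tokenizer(masked, return_tensors="pt")
        with torch.no_grad():
            logits = mlm(**inputs).logits
        pos = (inputs.input_ids[0] == tokenizer.mask_token_id).nonzero()[0].item()
        top = logits[0, pos].topk(k).indices.tolist()
        return tokenizer.convert_ids_to_tokens(top)
    # Multi-piece word: fall back to GloVe cosine similarity.
    if word not in glove:
        return [word]
    v = glove[word]
    sims = {w: float(np.dot(v, u) / (np.linalg.norm(v) * np.linalg.norm(u) + 1e-8))
            for w, u in glove.items() if w != word}
    return sorted(sims, key=sims.get, reverse=True)[:k]
```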

Maintenance & Community

The project is a reimplementation based on Huawei's TinyBERT. The README provides no community links or signals of active maintenance.

Licensing & Compatibility

The README does not explicitly state a license. Compatibility for commercial use or closed-source linking is not specified.

Limitations & Caveats

The project targets English data and GloVe embeddings; adapting it to Chinese or other languages requires manually swapping in different pre-trained models and embedding files. Evaluation details in the README are marked "To be continued."

Health Check

  • Last Commit: 4 years ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 1 star in the last 30 days

Explore Similar Projects

Starred by Forrest Iandola (Author of SqueezeNet; Research Scientist at Meta), Chris Van Pelt (Cofounder of Weights & Biases), and 2 more.

mt-dnn by namisan

2k stars
PyTorch package for multi-task deep neural networks research
Created 6 years ago · Updated 1 year ago