TinyBERT: Distilled pre-trained language model based on BERT
Top 95.9% on SourcePulse
This repository provides a simplified implementation of TinyBERT, a knowledge distillation framework for pre-trained language models. It aims to make the distillation process accessible enough for users to train their own distilled models on custom datasets.
How It Works
TinyBERT employs a multi-stage knowledge distillation process. It first distills a general-purpose student BERT model from a teacher BERT. Subsequently, it fine-tunes the teacher BERT on task-specific data and then distills this fine-tuned teacher into a task-specific student model. This involves minimizing losses across word embeddings, hidden layers, and attention mechanisms, with a final stage focusing on task prediction labels. Data augmentation techniques are also incorporated to improve performance.
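As an illustration only, the layer-wise objectives described above might be combined as in the PyTorch-style sketch below. The function name and the dictionary layout are assumptions, not the repository's API, and the student's hidden states are assumed to be already projected to the teacher's hidden size.

```python
# Minimal sketch (not the repository's code) of TinyBERT-style distillation losses.
import torch
import torch.nn.functional as F

def distillation_loss(student_out, teacher_out, temperature=1.0):
    """Combine embedding, hidden-state, attention, and prediction losses.

    Each *_out dict is assumed to hold:
      "embeddings": (batch, seq, hidden)
      "hidden":     list of (batch, seq, hidden) per distilled layer
      "attentions": list of (batch, heads, seq, seq) per distilled layer
      "logits":     (batch, num_labels) -- present only in the task-specific stage
    """
    # Word-embedding loss.
    loss = F.mse_loss(student_out["embeddings"], teacher_out["embeddings"])

    # Hidden-state and attention losses for each distilled layer pair.
    for s_h, t_h in zip(student_out["hidden"], teacher_out["hidden"]):
        loss = loss + F.mse_loss(s_h, t_h)
    for s_a, t_a in zip(student_out["attentions"], teacher_out["attentions"]):
        loss = loss + F.mse_loss(s_a, t_a)

    # Final stage: match the teacher's softened task predictions.
    if "logits" in student_out and "logits" in teacher_out:
        soft_targets = F.softmax(teacher_out["logits"] / temperature, dim=-1)
        log_probs = F.log_softmax(student_out["logits"] / temperature, dim=-1)
        loss = loss + F.kl_div(log_probs, soft_targets, reduction="batchmean") * temperature ** 2
    return loss
```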
Quick Start & Requirements
General distillation:
sh script/general_train.sh
Task-specific distillation, stage one:
sh script/task_train.sh one
Task-specific distillation, stage two:
sh script/task_train.sh two
Training and evaluation data are expected at data/train.txt and data/eval.txt.
Data augmentation:
python data_augmentation.py
Pass parameters for the pre-trained BERT model path, the data path, the GloVe path, and the augmentation settings.
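For intuition, the GloVe-based part of the augmentation step (replacing words with embedding-space neighbours to expand the training set) can be sketched as follows. This is an illustrative stand-in, not the repository's data_augmentation.py; the GloVe file path and replacement probability are hypothetical.

```python
# Rough sketch of GloVe nearest-neighbour word replacement for data augmentation.
import random
import numpy as np

def load_glove(path):
    """Load GloVe vectors from a text file into {word: np.ndarray}."""
    vectors = {}
    with open(path, encoding="utf-8") as f:
        for line in f:
            parts = line.rstrip().split(" ")
            vectors[parts[0]] = np.asarray(parts[1:], dtype=np.float32)
    return vectors

def nearest_words(word, vectors, k=5):
    """Return the k words closest to `word` by cosine similarity (brute force)."""
    if word not in vectors:
        return []
    v = vectors[word]
    scores = []
    for w, u in vectors.items():
        if w == word:
            continue
        sim = float(np.dot(v, u) / (np.linalg.norm(v) * np.linalg.norm(u) + 1e-8))
        scores.append((sim, w))
    return [w for _, w in sorted(scores, reverse=True)[:k]]

def augment(sentence, vectors, p_replace=0.3):
    """Randomly swap words for GloVe neighbours to create a new training example."""
    out = []
    for tok in sentence.split():
        if random.random() < p_replace:
            candidates = nearest_words(tok.lower(), vectors)
            out.append(random.choice(candidates) if candidates else tok)
        else:
            out.append(tok)
    return " ".join(out)

if __name__ == "__main__":
    glove = load_glove("glove.6B.50d.txt")  # hypothetical GloVe file
    print(augment("the movie was surprisingly good", glove))
```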
Maintenance & Community
The project is based on Huawei's TinyBERT. No community links or active-maintenance signals are provided in the README.
Licensing & Compatibility
The README does not explicitly state a license. Compatibility for commercial use or closed-source linking is not specified.
Limitations & Caveats
The project targets English data and GloVe embeddings; adapting it to Chinese or other languages requires manually swapping in a different pre-trained model and embedding file. Evaluation details are marked as "To be continued."
Last updated 4 years ago; status: Inactive.