LisennlpTinyBERT: Distilled pre-trained language model based on BERT
This repository provides a simplified implementation of TinyBERT, a knowledge distillation framework for pre-trained language models. It aims to make the distillation process accessible, so that users can train their own distilled models on custom datasets.
How It Works
TinyBERT employs a multi-stage knowledge distillation process. It first distills a general-purpose student BERT model from a teacher BERT. Subsequently, it fine-tunes the teacher BERT on task-specific data and then distills this fine-tuned teacher into a task-specific student model. This involves minimizing losses across word embeddings, hidden layers, and attention mechanisms, with a final stage focusing on task prediction labels. Data augmentation techniques are also incorporated to improve performance.
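To make the layer-wise objectives concrete, here is a minimal sketch of how such a combined distillation loss can be written. This is illustrative only: the names student, teacher, and proj, the layer pairing, and the temperature are assumptions, not this repository's exact code or settings.

```python
# Minimal sketch of a TinyBERT-style distillation loss (assumed interface:
# dicts of tensors for embeddings, hidden states, attentions, and logits;
# proj is a learned linear map from the student's hidden size to the teacher's).
import torch
import torch.nn.functional as F

def distillation_loss(student, teacher, proj, temperature=1.0):
    # Word-embedding loss: MSE after projecting student embeddings.
    loss = F.mse_loss(proj(student["embeddings"]), teacher["embeddings"])

    # Hidden-state and attention losses for each mapped layer pair.
    for s_hid, t_hid in zip(student["hidden_states"], teacher["hidden_states"]):
        loss = loss + F.mse_loss(proj(s_hid), t_hid)
    for s_att, t_att in zip(student["attentions"], teacher["attentions"]):
        loss = loss + F.mse_loss(s_att, t_att)

    # Task-specific stage only: soft cross-entropy against the teacher's
    # prediction distribution, softened by a temperature.
    if "logits" in student and "logits" in teacher:
        t_prob = F.softmax(teacher["logits"] / temperature, dim=-1)
        s_logp = F.log_softmax(student["logits"] / temperature, dim=-1)
        loss = loss + (-(t_prob * s_logp).sum(dim=-1)).mean()
    return loss
```

In the general-distillation stage only the embedding, hidden-state, and attention terms would apply; the prediction term is added when distilling the fine-tuned, task-specific teacher.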
Quick Start & Requirements
- General distillation: sh script/general_train.sh
- Task-specific distillation, stage one: sh script/task_train.sh one
- Task-specific distillation, stage two: sh script/task_train.sh two
- Training and evaluation data: data/train.txt, data/eval.txt (a data-preparation sketch follows this list)
- Data augmentation: python data_augmentation.py with parameters for the pre-trained BERT model path, data path, GloVe path, and augmentation settings
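A short sketch of preparing the expected data files from a custom corpus. Assumptions are flagged in the comments: the raw corpus file name, the one-example-per-line format, and the 80/20 split are illustrative; only the target paths data/train.txt and data/eval.txt come from the README.

```python
# Minimal sketch: split a raw corpus into the train/eval files the scripts read.
# Assumptions: corpus.txt (hypothetical) holds one example per line; 80/20 split.
from pathlib import Path
import random

lines = Path("corpus.txt").read_text(encoding="utf-8").splitlines()
random.seed(0)
random.shuffle(lines)
split = int(0.8 * len(lines))

Path("data").mkdir(exist_ok=True)
Path("data/train.txt").write_text("\n".join(lines[:split]), encoding="utf-8")
Path("data/eval.txt").write_text("\n".join(lines[split:]), encoding="utf-8")
```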
Maintenance & Community
The project is based on Huawei's TinyBERT. The README provides no community links or signals of active maintenance.
Licensing & Compatibility
The README does not explicitly state a license. Compatibility for commercial use or closed-source linking is not specified.
Limitations & Caveats
The project primarily uses English data and GloVe embeddings, requiring manual adaptation for Chinese or other languages by changing pre-trained models and embedding files. Evaluation details are marked as "To be continued."