Pre-training method for BERT using a permuted language model
PERT (Pre-training BERT with Permuted Language Model) is a self-supervised pre-training approach for Natural Language Understanding (NLU) models that aims to learn text semantics without using explicit mask tokens. It targets researchers and practitioners in NLP, offering improved performance on certain tasks by leveraging a permuted language modeling objective.
How It Works
PERT introduces a novel pre-training strategy by permuting the word order of the input text. Instead of masking tokens like BERT, PERT's objective is to predict the original position of each token within the permuted sequence. This approach allows the model to learn semantic information from shuffled text, potentially capturing richer contextual understanding without the need for artificial mask tokens.
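The snippet below is a minimal sketch of how such a permuted-LM training signal could be constructed, not the official PERT code: a small fraction of token positions is shuffled, and for each shuffled slot the label is the position its token originally occupied. The function name `permute_for_pert` and the 15% ratio are illustrative assumptions.

```python
import random

def permute_for_pert(token_ids, permute_ratio=0.15, seed=None):
    """Illustrative sketch (not the official PERT implementation).

    Shuffles a fraction of token positions; for each shuffled slot the
    label is the position its token originally occupied. Untouched slots
    get -100 so a standard cross-entropy loss can ignore them.
    """
    rng = random.Random(seed)
    n = len(token_ids)
    k = max(2, int(n * permute_ratio))          # how many positions to shuffle (assumes n >= 2)
    chosen = sorted(rng.sample(range(n), k))    # positions involved in the permutation
    shuffled = chosen[:]
    rng.shuffle(shuffled)

    permuted = list(token_ids)
    labels = [-100] * n
    for dst, src in zip(chosen, shuffled):
        permuted[dst] = token_ids[src]          # token from position src now sits at dst
        labels[dst] = src                       # model must predict its original position

    return permuted, labels
```

During pre-training, a position-prediction head would be trained with cross-entropy on these labels; for downstream use the encoder is loaded like a standard BERT, as shown in the quick start below.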
Quick Start & Requirements
Load the tokenizer and model through the standard transformers API:

```python
from transformers import BertTokenizer, BertModel

tokenizer = BertTokenizer.from_pretrained(MODEL_NAME)
model = BertModel.from_pretrained(MODEL_NAME)
```

with MODEL_NAME set to one of the provided Hugging Face repository names (e.g., hfl/chinese-pert-base).
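As a usage sketch (the Chinese example sentence is illustrative), the checkpoint can then be used as an ordinary BERT encoder to extract hidden states:

```python
import torch
from transformers import BertTokenizer, BertModel

tokenizer = BertTokenizer.from_pretrained("hfl/chinese-pert-base")
model = BertModel.from_pretrained("hfl/chinese-pert-base")

inputs = tokenizer("这是一个示例句子。", return_tensors="pt")  # "This is an example sentence."
with torch.no_grad():
    outputs = model(**inputs)

print(outputs.last_hidden_state.shape)  # (batch_size, sequence_length, hidden_size)
```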
Highlighted Details
Maintenance & Community
Licensing & Compatibility
Limitations & Caveats
PERT shows weaker performance on certain tasks, such as text classification, compared to other methods. Users are advised to test its effectiveness on their specific downstream tasks. The technical report was pending finalization at the time of the README's last update.