Pre-training method for BERT using a permuted language model
PERT (Pre-training BERT with Permuted Language Model) is a self-supervised pre-training approach for Natural Language Understanding (NLU) models that aims to learn text semantics without using explicit mask tokens. It targets researchers and practitioners in NLP, offering improved performance on certain tasks by leveraging a permuted language modeling objective.
How It Works
PERT introduces a novel pre-training strategy by permuting the word order of the input text. Instead of masking tokens like BERT, PERT's objective is to predict the original position of each token within the permuted sequence. This approach allows the model to learn semantic information from shuffled text, potentially capturing richer contextual understanding without the need for artificial mask tokens.
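The snippet below is a minimal sketch of how such a permuted-LM training signal could be constructed, not the official PERT code: a small fraction of token positions is shuffled, and for each shuffled slot the label is the position its token originally occupied. The function name `permute_for_pert` and the 15% ratio are illustrative assumptions.

```python
import random

def permute_for_pert(token_ids, permute_ratio=0.15, seed=None):
    """Illustrative sketch (not the official PERT implementation).

    Shuffles a fraction of token positions; for each shuffled slot the
    label is the position its token originally occupied. Untouched slots
    get -100 so a standard cross-entropy loss can ignore them.
    """
    rng = random.Random(seed)
    n = len(token_ids)
    k = max(2, int(n * permute_ratio))          # how many positions to shuffle (assumes n >= 2)
    chosen = sorted(rng.sample(range(n), k))    # positions involved in the permutation
    shuffled = chosen[:]
    rng.shuffle(shuffled)

    permuted = list(token_ids)
    labels = [-100] * n
    for dst, src in zip(chosen, shuffled):
        permuted[dst] = token_ids[src]          # token from position src now sits at dst
        labels[dst] = src                       # model must predict its original position

    return permuted, labels
```

During pre-training, a position-prediction head would be trained with cross-entropy on these labels; for downstream use the encoder is loaded like a standard BERT, as shown in the quick start below.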
Quick Start & Requirements
Load the tokenizer and model through the standard transformers API:

```python
from transformers import BertTokenizer, BertModel

tokenizer = BertTokenizer.from_pretrained(MODEL_NAME)
model = BertModel.from_pretrained(MODEL_NAME)
```

with MODEL_NAME set to one of the provided Hugging Face repository names (e.g., hfl/chinese-pert-base).
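As a usage sketch (the Chinese example sentence is illustrative), the checkpoint can then be used as an ordinary BERT encoder to extract hidden states:

```python
import torch
from transformers import BertTokenizer, BertModel

tokenizer = BertTokenizer.from_pretrained("hfl/chinese-pert-base")
model = BertModel.from_pretrained("hfl/chinese-pert-base")

inputs = tokenizer("这是一个示例句子。", return_tensors="pt")  # "This is an example sentence."
with torch.no_grad():
    outputs = model(**inputs)

print(outputs.last_hidden_state.shape)  # (batch_size, sequence_length, hidden_size)
```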
Highlighted Details
Maintenance & Community
Licensing & Compatibility
Limitations & Caveats
PERT shows weaker performance on certain tasks, such as text classification, compared to other methods. Users are advised to test its effectiveness on their specific downstream tasks. The technical report was pending finalization at the time of the README's last update.