PERT by ymcui

Pre-training method for BERT using a permuted language model

Created 3 years ago · 364 stars · Top 78.4% on sourcepulse

View on GitHub
Project Summary

PERT (Pre-training BERT with Permuted Language Model) is a self-supervised pre-training approach for Natural Language Understanding (NLU) models that aims to learn text semantics without using explicit mask tokens. It targets researchers and practitioners in NLP, offering improved performance on certain tasks by leveraging a permuted language modeling objective.

How It Works

PERT introduces a novel pre-training strategy by permuting the word order of the input text. Instead of masking tokens like BERT, PERT's objective is to predict the original position of each token within the permuted sequence. This approach allows the model to learn semantic information from shuffled text, potentially capturing richer contextual understanding without the need for artificial mask tokens.
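
As a toy illustration of that objective (not the project's actual pre-training code), the sketch below shuffles token positions and builds the original-position targets described above; every name in it is made up for the example.

```python
# Toy sketch of the permuted-LM signal (illustration only, not PERT's
# pre-training code): shuffle token positions, then train the model to
# predict each token's original position instead of a masked word.
import random

tokens = ["the", "quick", "brown", "fox", "jumps"]
permutation = list(range(len(tokens)))
random.shuffle(permutation)

# The model reads the shuffled sequence ...
permuted_tokens = [tokens[i] for i in permutation]
# ... and for each slot the target is the original position of the token
# that now sits there.
targets = permutation

print(permuted_tokens)  # e.g. ['fox', 'the', 'jumps', 'brown', 'quick']
print(targets)          # e.g. [3, 0, 4, 2, 1]
```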

Quick Start & Requirements

  • Install/Run: Models are available via 🤗 Transformers. Use from transformers import BertTokenizer, BertModel with MODEL_NAME set to one of the provided Hugging Face repository names (e.g., hfl/chinese-pert-base); a minimal loading sketch follows this list.
  • Prerequisites: Python, 🤗 Transformers library. TensorFlow 1.15 weights are also available for direct download.
  • Resources: Base models are ~0.4GB, Large models are ~1.2GB.
  • Links: Hugging Face Models, Technical Report
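
A minimal loading sketch based on the classes and checkpoint name given above; it assumes PyTorch is installed alongside 🤗 Transformers.

```python
# Minimal sketch: load PERT through the standard BERT classes, since the
# architecture stays BERT-compatible. Assumes torch and transformers are installed.
from transformers import BertTokenizer, BertModel

MODEL_NAME = "hfl/chinese-pert-base"  # base checkpoint named in the summary

tokenizer = BertTokenizer.from_pretrained(MODEL_NAME)
model = BertModel.from_pretrained(MODEL_NAME)

# Encode a sentence and take the last hidden states as contextual features.
inputs = tokenizer("哈尔滨是黑龙江的省会。", return_tensors="pt")
outputs = model(**inputs)
print(outputs.last_hidden_state.shape)  # (batch, sequence_length, hidden_size)
```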

Highlighted Details

  • Offers both Chinese and English versions in base and large sizes.
  • Achieves performance improvements on some NLU tasks, particularly reading comprehension and sequence labeling.
  • Provides specialized MRC (Machine Reading Comprehension) fine-tuned versions (see the sketch after this list).
  • The core architecture remains compatible with the standard BERT structure.
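
A hedged sketch of querying one of those MRC checkpoints with the standard extractive-QA head, which works because the architecture stays BERT-compatible; the model ID below is an assumption, so check the project's model list for the actual names.

```python
# Sketch only: the MRC checkpoint ID is assumed, not confirmed by this summary.
from transformers import BertForQuestionAnswering, BertTokenizerFast, pipeline

MRC_MODEL = "hfl/chinese-pert-base-mrc"  # hypothetical/assumed model ID

qa = pipeline(
    "question-answering",
    model=BertForQuestionAnswering.from_pretrained(MRC_MODEL),
    tokenizer=BertTokenizerFast.from_pretrained(MRC_MODEL),
)

print(qa(
    question="PERT使用什么预训练目标？",
    context="PERT是一种使用乱序语言模型作为预训练目标的自监督模型。",
))
```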

Maintenance & Community

  • Developed by Harbin Institute of Technology (HIT) & iFlytek Joint Lab (HFL).
  • Recent updates include Chinese LLaMA & Alpaca models and the LERT model.
  • Issue tracking is managed via GitHub Issues.

Licensing & Compatibility

  • The README does not explicitly state a license, and compatibility with commercial use or closed-source linking is not specified. Check the repository directly for licensing terms before adopting the models.

Limitations & Caveats

PERT shows weaker performance on certain tasks, such as text classification, compared to other methods. Users are advised to test its effectiveness on their specific downstream tasks. The technical report was pending finalization at the time of the README's last update.

Health Check

  • Last commit: 2 weeks ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star history: 3 stars in the last 90 days
