UER-py by dbiir

PyTorch toolkit for pre-training and fine-tuning NLP models

Created 6 years ago · 3,072 stars · Top 15.9% on sourcepulse

Project Summary

UER-py is an open-source PyTorch framework for pre-training and fine-tuning Natural Language Processing (NLP) models. It targets researchers and practitioners seeking to leverage or extend universal encoder representations, offering modularity and a model zoo for various NLP tasks. The framework aims to reproduce state-of-the-art (SOTA) results and facilitate custom model development.

How It Works

UER-py employs a modular architecture that separates models into components such as embeddings, encoders, and targets, allowing users to flexibly combine modules into custom pre-training models. It supports a range of pre-training objectives and architectures, including BERT, GPT-2, ELMo, and T5, and facilitates distributed training across multiple GPUs.
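
As a rough illustration of this composition, here is a minimal PyTorch sketch; the class and argument names are hypothetical stand-ins for this summary, not UER-py's actual API:

    import torch.nn as nn

    # Hypothetical sketch of the embedding/encoder/target split described above;
    # the names are invented for illustration.
    class PretrainModel(nn.Module):
        def __init__(self, embedding: nn.Module, encoder: nn.Module, target: nn.Module):
            super().__init__()
            self.embedding = embedding  # e.g., word + position (+ segment) embeddings
            self.encoder = encoder      # e.g., a Transformer, LSTM, or CNN stack
            self.target = target        # e.g., an MLM head that computes the loss

        def forward(self, tokens, labels):
            hidden = self.encoder(self.embedding(tokens))
            return self.target(hidden, labels)  # pre-training loss

Swapping in a different encoder module (say, a causal Transformer instead of a bidirectional one) changes the architecture from BERT-style to GPT-2-style without touching the embedding or target code.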

Quick Start & Requirements

  • Install: pip install -r requirements.txt
  • Prerequisites: Python >= 3.6, PyTorch >= 1.1, six >= 1.12.0. Additional dependencies for specific features include TensorFlow, SentencePiece, LightGBM, BayesianOptimization, jieba, and pytorch-crf.
  • Setup: Pre-processing can be time-consuming, and distributed training requires multiple GPUs; a sketch of the documented workflow follows this list.
  • Docs: UER-py Project Wiki
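
The quick start is a two-step workflow: build a binary dataset from a corpus, then pre-train on it. The sketch below follows the wiki's Chinese book-review BERT example; exact flag names may differ across versions.

    python3 preprocess.py --corpus_path corpora/book_review_bert.txt \
                          --vocab_path models/google_zh_vocab.txt \
                          --dataset_path dataset.pt \
                          --processes_num 8 --target bert

    python3 pretrain.py --dataset_path dataset.pt \
                        --vocab_path models/google_zh_vocab.txt \
                        --output_model_path models/book_review_model.bin \
                        --world_size 1 --gpu_ranks 0 \
                        --total_steps 5000 --save_checkpoint_steps 1000 \
                        --embedding word_pos_seg --encoder transformer \
                        --mask fully_visible --target bert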

Highlighted Details

  • Reproduces performance of models like BERT, GPT-2, ELMo, and T5.
  • Supports CPU, single-GPU, and multi-machine distributed training (see the sketch after this list).
  • Offers a model zoo with pre-trained models of different properties.
  • Provides winning solutions for NLP competitions like CLUE.
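
For the distributed mode, the wiki's examples launch the same pretrain.py on every node with a shared world size and per-node GPU ranks; the master address below is a placeholder, and flags may vary by version.

    # On node 0 (the master); node 1 would run the same command with --gpu_ranks 4 5 6 7.
    python3 pretrain.py --dataset_path dataset.pt \
                        --vocab_path models/google_zh_vocab.txt \
                        --output_model_path models/output_model.bin \
                        --world_size 8 --gpu_ranks 0 1 2 3 \
                        --master_ip tcp://node-0-addr:port \
                        --embedding word_pos_seg --encoder transformer \
                        --mask fully_visible --target bert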

Maintenance & Community

The project is associated with Tencent and has contributors from both academia and industry. A refactored successor, TencentPretrain, is available; it supports multi-modal models and larger-scale model training.

Licensing & Compatibility

The README does not explicitly state a license. Compatibility for commercial use or closed-source linking is not specified.

Limitations & Caveats

The project recommends TencentPretrain for multi-modal or large model training, suggesting UER-py is primarily suited for text models under one billion parameters. The README does not specify a license, which could impact commercial adoption.

Health Check

  • Last commit: 1 year ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 17 stars in the last 90 days

Explore Similar Projects

Starred by Stas Bekman (author of the Machine Learning Engineering Open Book; Research Engineer at Snowflake), Abhishek Thakur (world's first 4x Kaggle GrandMaster), and 5 more.

xlnet by zihangdai
  Language model research paper using generalized autoregressive pretraining
  6k stars · created 6 years ago · updated 2 years ago