UER-py by dbiir

PyTorch toolkit for pre-training and fine-tuning NLP models

Created 6 years ago · 3,072 stars · Top 15.9% on sourcepulse

Project Summary

UER-py is an open-source PyTorch framework for pre-training and fine-tuning Natural Language Processing (NLP) models. It targets researchers and practitioners seeking to leverage or extend universal encoder representations, offering modularity and a model zoo for various NLP tasks. The framework aims to reproduce state-of-the-art (SOTA) results and facilitate custom model development.

How It Works

UER-py employs a modular architecture that separates models into components such as embeddings, encoders, and targets, allowing users to flexibly combine modules into custom pre-training models. It supports a range of pre-training objectives and architectures, including BERT, GPT-2, ELMo, and T5, and facilitates distributed training across multiple GPUs.
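
As a rough illustration of this composition, here is a minimal PyTorch sketch; the class and argument names are hypothetical stand-ins for this summary, not UER-py's actual API:

    import torch.nn as nn

    # Hypothetical sketch of the embedding/encoder/target split described above;
    # the names are invented for illustration.
    class PretrainModel(nn.Module):
        def __init__(self, embedding: nn.Module, encoder: nn.Module, target: nn.Module):
            super().__init__()
            self.embedding = embedding  # e.g., word + position (+ segment) embeddings
            self.encoder = encoder      # e.g., a Transformer, LSTM, or CNN stack
            self.target = target        # e.g., an MLM head that computes the loss

        def forward(self, tokens, labels):
            hidden = self.encoder(self.embedding(tokens))
            return self.target(hidden, labels)  # pre-training loss

Swapping in a different encoder module (say, a causal Transformer instead of a bidirectional one) changes the architecture from BERT-style to GPT-2-style without touching the embedding or target code.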

Quick Start & Requirements

  • Install: pip install -r requirements.txt
  • Prerequisites: Python >= 3.6, PyTorch >= 1.1, six >= 1.12.0. Additional dependencies for specific features include TensorFlow, SentencePiece, LightGBM, BayesianOptimization, jieba, and pytorch-crf.
  • Setup: Pre-processing can be time-consuming, and distributed training requires multiple GPUs; a sketch of the documented workflow follows this list.
  • Docs: UER-py Project Wiki
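
The quick start is a two-step workflow: build a binary dataset from a corpus, then pre-train on it. The sketch below follows the wiki's Chinese book-review BERT example; exact flag names may differ across versions.

    python3 preprocess.py --corpus_path corpora/book_review_bert.txt \
                          --vocab_path models/google_zh_vocab.txt \
                          --dataset_path dataset.pt \
                          --processes_num 8 --target bert

    python3 pretrain.py --dataset_path dataset.pt \
                        --vocab_path models/google_zh_vocab.txt \
                        --output_model_path models/book_review_model.bin \
                        --world_size 1 --gpu_ranks 0 \
                        --total_steps 5000 --save_checkpoint_steps 1000 \
                        --embedding word_pos_seg --encoder transformer \
                        --mask fully_visible --target bert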

Highlighted Details

  • Reproduces performance of models like BERT, GPT-2, ELMo, and T5.
  • Supports CPU, single-GPU, and multi-machine distributed training (see the sketch after this list).
  • Offers a model zoo with pre-trained models of different properties.
  • Provides winning solutions for NLP competitions like CLUE.
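
For the distributed mode, the wiki's examples launch the same pretrain.py on every node with a shared world size and per-node GPU ranks; the master address below is a placeholder, and flags may vary by version.

    # On node 0 (the master); node 1 would run the same command with --gpu_ranks 4 5 6 7.
    python3 pretrain.py --dataset_path dataset.pt \
                        --vocab_path models/google_zh_vocab.txt \
                        --output_model_path models/output_model.bin \
                        --world_size 8 --gpu_ranks 0 1 2 3 \
                        --master_ip tcp://node-0-addr:port \
                        --embedding word_pos_seg --encoder transformer \
                        --mask fully_visible --target bert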

Maintenance & Community

The project is associated with Tencent and has contributors from both academia and industry. A refactored successor, TencentPretrain, is available; it supports multi-modal models and larger-scale model training.

Licensing & Compatibility

The README does not explicitly state a license. Compatibility for commercial use or closed-source linking is not specified.

Limitations & Caveats

The project recommends TencentPretrain for multi-modal or large model training, suggesting UER-py is primarily suited for text models under one billion parameters. The README does not specify a license, which could impact commercial adoption.

Health Check

  • Last commit: 1 year ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 17 stars in the last 90 days

Explore Similar Projects

Starred by Stas Bekman (author of the Machine Learning Engineering Open Book; Research Engineer at Snowflake), Abhishek Thakur (world's first 4x Kaggle GrandMaster), and 5 more.

xlnet by zihangdai
  Language model research paper using generalized autoregressive pretraining
  6k stars · created 6 years ago · updated 2 years ago