ERNIE by PaddlePaddle

PaddlePaddle implementations of the ERNIE family of pre-trained models

created 6 years ago
7,381 stars

Top 7.2% on sourcepulse

Project Summary

This repository provides official implementations for the ERNIE family of pre-trained models, developed by Baidu. It targets researchers and developers working on Natural Language Processing (NLP) and multimodal understanding and generation tasks, offering a comprehensive suite for building and deploying advanced language models.

How It Works

ERNIE models are knowledge-enhanced large language models that integrate external knowledge into pre-training. This approach aims to improve language understanding and generation capabilities by explicitly modeling relationships between entities and concepts. The framework supports both dynamic and static graph training, allowing for flexibility in model development and deployment.
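One concrete form of knowledge-enhanced pre-training is entity-level masking, in which all tokens of a named entity are hidden together so the model has to predict the entity from context rather than from its own fragments. The sketch below is a minimal pure-Python illustration of that idea, not code from the repository. The function names, the `[MASK]` string, and the hand-written entity spans are assumptions made for the example.

```python
import random

MASK = "[MASK]"  # placeholder mask token, chosen for illustration


def token_level_mask(tokens, mask_prob, rng):
    """Baseline: decide independently for each token whether to mask it."""
    return [MASK if rng.random() < mask_prob else tok for tok in tokens]


def entity_level_mask(tokens, entity_spans, mask_prob, rng):
    """Mask whole entity spans together.

    entity_spans is a list of (start, end) index pairs, end exclusive.
    Each span is either fully masked or left fully intact.
    """
    masked = list(tokens)
    for start, end in entity_spans:
        if rng.random() < mask_prob:
            for i in range(start, end):
                masked[i] = MASK
    return masked


# Hypothetical example sentence with two hand-annotated entity spans.
tokens = ["Harry", "Potter", "was", "written", "by", "J.", "K.", "Rowling"]
spans = [(0, 2), (5, 8)]

rng = random.Random(0)
print(token_level_mask(tokens, 0.5, rng))
print(entity_level_mask(tokens, spans, 0.5, rng))
```

With token-level masking, "Harry" can be hidden while "Potter" stays visible, which makes the prediction nearly trivial. Span-level masking removes that shortcut.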

Quick Start & Requirements

  • Install: git clone https://github.com/PaddlePaddle/ERNIE.git
  • Prerequisites: PaddlePaddle framework. Specific model versions may have additional requirements.
  • Setup: Download pre-trained models (e.g., sh download_ernie_3.0_base_ch.sh) and configure JSON files for training/inference.
  • Docs: ERNIE Model Introduction
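The setup step above mentions configuring JSON files for training and inference. As a rough sketch of what working with such a file can look like, the snippet below writes a small config, reloads it, and overrides one value. Every key in `example_config` and the `load_config` helper are hypothetical and do not reflect the repository's actual schema or tooling; consult the repository's own example configs for the real field names.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical config: these keys are illustrative, not the repository's schema.
example_config = {
    "model": {
        "name": "ernie_3.0_base_ch",
        "checkpoint_dir": "./models/ernie_3.0_base_ch",
    },
    "train": {"batch_size": 32, "epochs": 3, "learning_rate": 2e-5},
}


def load_config(path, overrides=None):
    """Read a JSON config and apply dotted-key overrides.

    Example override: {"train.batch_size": 16}
    """
    config = json.loads(Path(path).read_text(encoding="utf-8"))
    for dotted_key, value in (overrides or {}).items():
        *parents, leaf = dotted_key.split(".")
        node = config
        for key in parents:
            node = node.setdefault(key, {})
        node[leaf] = value
    return config


with tempfile.TemporaryDirectory() as tmp:
    config_path = Path(tmp) / "train.json"
    config_path.write_text(json.dumps(example_config, indent=2), encoding="utf-8")
    config = load_config(config_path, {"train.batch_size": 16})
    print(config["train"])
```

Keeping overrides separate from the file on disk makes it easy to try a different batch size or learning rate without editing the checked-in config.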

Highlighted Details

  • Supports a wide range of NLP tasks including text classification, sequence labeling, information extraction, and text generation.
  • Includes multimodal models like ERNIE-ViL for vision-language understanding.
  • Offers data preprocessing tools for cleaning, augmentation, and format conversion.
  • Reports state-of-the-art results on benchmarks such as GLUE and SemEval.

Maintenance & Community

The project is actively maintained by Baidu and has seen contributions from numerous researchers. Information on roadmaps and community channels is available within the repository.

Licensing & Compatibility

The provided README does not explicitly state the repository's license, although the PaddlePaddle framework it builds on is licensed under Apache 2.0. Verify the license terms before any commercial use.

Limitations & Caveats

Older versions of the ERNIE code have been migrated to a repro branch, which indicates potential breaking changes or a shift in the primary development focus. The README refers to a "newly upgraded dynamic-static combined ERNIE suite," so users should expect differences between versions.

Health Check

  • Last commit: 1 day ago
  • Responsiveness: Inactive
  • Pull requests (30d): 90
  • Issues (30d): 19
  • Star history: 1,045 stars in the last 90 days
