End-to-end recipes for BERT pre-training and fine-tuning
This repository provides end-to-end recipes for pre-training and fine-tuning BERT models using Azure Machine Learning Service. It targets NLP researchers and engineers who need to build custom language representation models on domain-specific data or adapt existing BERT models for specialized tasks, offering a stable and predictable workflow for large-scale distributed training.
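For orientation, the following is a minimal sketch of how a distributed training run like the ones in these recipes is typically submitted through the Azure ML Python SDK (v1). The workspace configuration, compute target name, script path, and arguments are illustrative assumptions, not values taken from this repository.

```python
# Hypothetical sketch: submitting a distributed BERT pre-training job to Azure ML (SDK v1).
# The compute target, environment file, script name, and arguments are assumptions.
from azureml.core import Workspace, Experiment, Environment, ScriptRunConfig
from azureml.core.runconfig import MpiConfiguration

ws = Workspace.from_config()                          # reads config.json for the workspace
compute_target = ws.compute_targets["gpu-cluster"]    # assumed multi-node GPU cluster

env = Environment.from_conda_specification(
    name="bert-pretrain", file_path="environment.yml"  # assumed conda spec with PyTorch
)

# MPI launcher: 2 nodes x 4 processes (GPUs) per node in this illustrative setup
distributed_config = MpiConfiguration(process_count_per_node=4, node_count=2)

run_config = ScriptRunConfig(
    source_directory="pretrain",          # assumed folder containing the training script
    script="train.py",                    # assumed entry point
    arguments=["--gradient_accumulation_steps", "16", "--fp16"],
    compute_target=compute_target,
    environment=env,
    distributed_job_config=distributed_config,
)

run = Experiment(ws, "bert-pretraining").submit(run_config)
run.wait_for_completion(show_output=True)
```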
How It Works
The pre-training recipe is built on PyTorch and Hugging Face's pytorch-pretrained-BERT library (v0.6.2), and incorporates optimization techniques such as gradient accumulation and mixed-precision training to handle large models and datasets efficiently. The fine-tuning recipe demonstrates how to adapt pre-trained checkpoints to downstream tasks, evaluating against the GLUE benchmark on Azure ML. Together, the recipes aim to simplify the otherwise complex process of distributed training and model configuration for large language models.
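As a rough illustration of the gradient accumulation and mixed-precision pattern described above: the sketch below uses torch.cuda.amp, which postdates this repository's implementation, so treat it as an equivalent modern pattern rather than the repo's actual code. The accumulation step count and model interface are assumptions.

```python
# Illustrative sketch: gradient accumulation combined with mixed-precision training.
# Not the repository's code; its implementation predates torch.cuda.amp.
import torch


def train_one_epoch(model, optimizer, dataloader, accumulation_steps=16):
    """Accumulate gradients over several micro-batches before each optimizer step."""
    scaler = torch.cuda.amp.GradScaler()        # scales the loss to avoid fp16 underflow
    model.train()
    optimizer.zero_grad()

    for step, (input_ids, labels) in enumerate(dataloader):
        with torch.cuda.amp.autocast():         # run the forward pass in mixed precision
            loss = model(input_ids, labels)     # assumes the model returns a scalar loss
        # divide so the accumulated gradients match a large-batch average
        scaler.scale(loss / accumulation_steps).backward()

        if (step + 1) % accumulation_steps == 0:
            scaler.step(optimizer)              # unscales gradients, then optimizer.step()
            scaler.update()
            optimizer.zero_grad()
```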
Quick Start & Requirements
Highlighted Details
Maintenance & Community
This repository is from Microsoft. A note dated 7/7/2020 states that a more recent and significantly faster implementation of BERT pre-training is available using ONNX Runtime. The repository was last updated roughly two years ago and is marked inactive.
Licensing & Compatibility
The repository does not explicitly state a license in the provided README text. Suitability for commercial use or closed-source linking therefore depends on the license that actually applies to the repository.
Limitations & Caveats
The README explicitly points to a more recent, significantly faster ONNX Runtime implementation of BERT pre-training, suggesting that this repository's pre-training recipe may be outdated or less performant. Setup requires an Azure ML service account and potentially substantial GPU resources.