ConvBERT by yitu-opensource

ConvBERT: research code for improving BERT with span-based dynamic convolution

created 5 years ago
251 stars

Top 99.8% on sourcepulse

View on GitHub
Project Summary

ConvBERT introduces a novel architecture for pre-trained language models, enhancing BERT with span-based dynamic convolution. This approach aims to improve performance and efficiency for natural language processing tasks, targeting researchers and practitioners in the field.

How It Works

ConvBERT replaces part of BERT's self-attention with span-based dynamic convolution. Instead of using fixed convolution kernels, each position's kernel is generated from a local span of surrounding tokens, so local dependencies are captured by lightweight convolution heads while the remaining self-attention heads handle global context. This mixed-attention design aims to reduce computation relative to full self-attention while preserving accuracy on language-understanding tasks; a simplified sketch follows.
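The snippet below is an illustrative, single-head NumPy sketch of that idea, not the repository's TensorFlow implementation; names such as span_dynamic_conv, w_span, and w_kernel are invented for readability. It shows the core mechanic: a per-position kernel is generated from a local span of tokens and then applied as a softmax-normalised depth-wise convolution over the value vectors.

```python
# Illustrative sketch of span-based dynamic convolution (not the repo's code).
# Kernels are generated from a local span of the input rather than a single
# token, then applied as a lightweight (softmax-normalised) convolution.
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def span_dynamic_conv(x, w_q, w_v, w_span, w_kernel, kernel_size=3):
    """x: (seq_len, d). Returns (seq_len, d)."""
    seq_len, d = x.shape
    pad = kernel_size // 2
    q = x @ w_q                      # queries, (seq_len, d)
    v = x @ w_v                      # values,  (seq_len, d)

    # 1) Span summary: a depth-wise convolution over a local window of tokens,
    #    so the kernel generator conditions on a span instead of one position.
    x_pad = np.pad(x, ((pad, pad), (0, 0)))
    span = np.stack([(x_pad[i:i + kernel_size] * w_span).sum(axis=0)
                     for i in range(seq_len)])           # (seq_len, d)

    # 2) Dynamic kernel per position, conditioned on query * span summary,
    #    normalised with softmax over the kernel taps.
    kernels = softmax((q * span) @ w_kernel, axis=-1)    # (seq_len, kernel_size)

    # 3) Apply each position's kernel to its window of value vectors.
    v_pad = np.pad(v, ((pad, pad), (0, 0)))
    out = np.stack([kernels[i] @ v_pad[i:i + kernel_size]
                    for i in range(seq_len)])            # (seq_len, d)
    return out

# Toy usage with random weights.
rng = np.random.default_rng(0)
d, k = 8, 3
x = rng.normal(size=(5, d))
out = span_dynamic_conv(x, rng.normal(size=(d, d)), rng.normal(size=(d, d)),
                        rng.normal(size=(k, d)), rng.normal(size=(d, k)), k)
print(out.shape)  # (5, 8)
```

In the actual model these convolution heads sit alongside ordinary self-attention heads in a mixed-attention block, which is where the efficiency gain comes from.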

Quick Start & Requirements

  • Install: Clone the repository and install dependencies: pip install tensorflow==1.15 numpy scikit-learn.
  • Prerequisites: TensorFlow 1.15, NumPy, scikit-learn. Tested on a V100 GPU (a quick environment-check sketch follows this list).
  • Pre-training: Requires downloading the OpenWebText corpus (12GB), processing it (approx. 30GB disk space), and running bash build_data.sh followed by bash pretrain.sh.
  • Fine-tuning: Download GLUE data and run bash finetune.sh. A Google Colab notebook is available for a quick example.
  • Pre-trained Model: Available for download.
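Because the scripts target an old TensorFlow release, a small sanity check like the following (illustrative only, not part of the repository) can confirm the environment before launching the long pre-training run:

```python
# Quick environment check before running build_data.sh / pretrain.sh / finetune.sh.
# Illustrative only; the repository itself is driven entirely by the bash scripts above.
import numpy as np
import sklearn
import tensorflow as tf

assert tf.__version__.startswith("1.15"), "ConvBERT's scripts target TensorFlow 1.15"
print("TensorFlow:", tf.__version__)
print("NumPy:", np.__version__, "| scikit-learn:", sklearn.__version__)
print("GPU available:", tf.test.is_gpu_available())  # the authors report testing on a V100
```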

Highlighted Details

  • Introduces span-based dynamic convolution for improved language model pre-training.
  • Achieves competitive performance on GLUE benchmarks.
  • Codebase is based on ELECTRA.

Maintenance & Community

No specific information on maintainers, community channels, or roadmap is provided in the README.

Licensing & Compatibility

The repository does not explicitly state a license. The code is presented for research purposes.

Limitations & Caveats

The project requires TensorFlow 1.15, which is an older version and may present compatibility challenges with modern TensorFlow ecosystems. Pre-training requires significant data and disk space.

Health Check

  • Last commit: 2 years ago
  • Responsiveness: Inactive
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 1 star in the last 90 days
