Companion code for the research paper on BERT subarchitecture extraction
Bort provides an optimal subarchitecture for BERT, significantly reducing its size and computational requirements. This is achieved using a fully polynomial-time approximation scheme (FPTAS) for neural architecture search, making it suitable for researchers and practitioners seeking efficient NLP models.
How It Works
Bort extracts an optimal subset of BERT's architectural parameters, yielding a model that is 5.5% the effective size of BERT-large (16% of its net size). The extraction is framed as neural architecture search and solved with an FPTAS, and the resulting model delivers substantial inference speedups (7.9x on CPU relative to BERT-base) while requiring only about 1.2% of the time needed to pre-train RoBERTa-large.
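The repository does not ship the search procedure itself, so the snippet below is only a minimal sketch of the general idea: enumerate candidate (depth, heads, hidden, intermediate) configurations, filter them with a quality proxy, and keep the smallest survivor. The param_count and surrogate_quality functions and the threshold are illustrative assumptions, not the paper's FPTAS or its actual objective.

from itertools import product

def param_count(depth, heads, hidden, intermediate, vocab=30522):
    # Rough parameter count for a BERT-style encoder: token embeddings plus,
    # per layer, the Q/K/V/output projections and the feed-forward projections.
    per_layer = 4 * hidden * hidden + 2 * hidden * intermediate
    return vocab * hidden + depth * per_layer

def surrogate_quality(depth, heads, hidden, intermediate):
    # Stand-in for an estimated quality signal (hypothetical, not from the paper).
    return 0.4 * depth / 8 + 0.4 * hidden / 1024 + 0.2 * intermediate / 4096

best = None
for d, a, h, i in product([2, 4, 8], [4, 8, 16], [512, 768, 1024], [768, 1024, 3072]):
    if h % a != 0:                           # hidden size must split evenly across heads
        continue
    if surrogate_quality(d, a, h, i) < 0.5:  # assumed quality threshold
        continue
    size = param_count(d, a, h, i)
    if best is None or size < best[0]:
        best = (size, (d, a, h, i))

print("smallest qualifying candidate:", best)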
Quick Start & Requirements
# install Python dependencies
pip install -r requirements.txt
# download the pre-trained Bort weights from the public S3 bucket
aws s3 cp s3://alexa-saif-bort/bort.params model/
# fetch a sample text file for smoke tests (if this blob URL returns an HTML page,
# use the corresponding raw.githubusercontent.com URL instead)
wget https://github.com/dmlc/gluon-nlp/blob/v0.9.x/scripts/bert/sample_text.txt
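Once the dependencies and weights are in place, a quick sanity check is to load the parameter file with MXNet and count the weights. This is a minimal sketch assuming MXNet was installed via requirements.txt and bort.params was copied into model/; it is not one of the repository's own scripts.

import os
import mxnet as mx

params_path = "model/bort.params"
assert os.path.exists(params_path), "run the aws s3 cp step above first"

# mx.nd.load returns a dict mapping parameter names to NDArrays
params = mx.nd.load(params_path)
total = sum(int(arr.size) for arr in params.values())
print(f"{len(params)} tensors, {total / 1e6:.1f}M parameters")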
Highlighted Details
Maintenance & Community
The project accompanies research papers by Adrian de Wynter and Daniel J. Perry. The README mentions no community channels or active-maintenance signals; the last recorded activity was about three years ago and the repository appears inactive.
Licensing & Compatibility
Limitations & Caveats
Fine-tuning may yield odd results without an implementation of the Agora algorithm, which the paper references but this repository does not include. Out-of-memory errors can occur with large batch sizes or long sequences; reducing the sequence length is the recommended workaround, as the rough estimate below illustrates.
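Sequence length is the most effective knob because the attention score matrices grow quadratically with it. The figures below are a rough back-of-the-envelope estimate with assumed batch size, head count, and fp32 activations, not measurements from this repository.

def attention_activation_bytes(batch, heads, seq_len, bytes_per_float=4):
    # One (seq_len x seq_len) attention score matrix per head per example, per layer.
    return batch * heads * seq_len * seq_len * bytes_per_float

# Halving the sequence length cuts this term by 4x.
for seq_len in (1024, 512, 256):
    mib = attention_activation_bytes(batch=16, heads=8, seq_len=seq_len) / 2**20
    print(f"seq_len={seq_len}: ~{mib:.0f} MiB of score matrices per layer")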