Chinese NER fine-tuning using ALBERT pre-trained language model
This project provides a fine-tuned ALBERT model for Named Entity Recognition (NER) on Chinese text, aiming to match or exceed BERT's performance with a significantly smaller model and shorter training time. It is intended for researchers and practitioners working on Chinese NLP tasks.
How It Works
The project leverages the ALBERT architecture, specifically a Chinese pre-trained ALBERT base model, and fine-tunes it for NER. ALBERT's parameter reduction techniques, such as cross-layer parameter sharing and factorized embedding parameterization, give it a smaller footprint and faster training than BERT while aiming to maintain comparable accuracy.
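As a rough illustration of why factorized embedding parameterization shrinks the model, the sketch below compares embedding parameter counts for a BERT-style full-size embedding matrix versus ALBERT's factorized version. The vocabulary, hidden, and embedding sizes are typical values for a Chinese base model, not figures read from this repository's config.

```python
# Illustrative parameter count for factorized embedding parameterization.
# The sizes below are common defaults (assumptions), not this repo's settings.
vocab_size = 21128   # typical Chinese WordPiece vocabulary size
hidden_size = 768    # Transformer hidden dimension (base-sized model)
embed_size = 128     # smaller embedding dimension used by ALBERT

bert_style_params = vocab_size * hidden_size                               # V x H
albert_style_params = vocab_size * embed_size + embed_size * hidden_size   # V x E + E x H

print(f"BERT-style embedding params:   {bert_style_params:,}")    # ~16.2M
print(f"ALBERT-style embedding params: {albert_style_params:,}")  # ~2.8M
```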
Quick Start & Requirements
python albert_ner.py \
  --task_name ner \
  --do_train true \
  --do_eval true \
  --data_dir data \
  --vocab_file ./albert_config/vocab.txt \
  --bert_config_file ./albert_base_zh/albert_config_base.json \
  --max_seq_length 128 \
  --train_batch_size 64 \
  --learning_rate 2e-5 \
  --num_train_epochs 3 \
  --output_dir albert_base_ner_checkpoints
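For reference, BERT/ALBERT NER fine-tuning scripts of this kind typically expect character-level, BIO-tagged files under the data directory. The snippet below writes a tiny example in that format; the file name train.txt and the LOC label are assumptions for illustration, not requirements documented in the README.

```python
# Hypothetical example of the character-per-line BIO format commonly used by
# BERT/ALBERT NER fine-tuning scripts; file name and label set are assumptions.
import os

example = (
    "我 O\n"
    "在 O\n"
    "北 B-LOC\n"
    "京 I-LOC\n"
    "工 O\n"
    "作 O\n"
    "\n"   # a blank line separates sentences
)

os.makedirs("data", exist_ok=True)
with open("data/train.txt", "w", encoding="utf-8") as f:
    f.write(example)
```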
Maintenance & Community
No specific community channels, roadmap, or contributor information is provided in the README. The repository was last updated roughly 4 years ago and appears inactive.
Licensing & Compatibility
The README does not specify a license. It explicitly states that TensorFlow 2.0 is not supported.
Limitations & Caveats
The project explicitly does not support TensorFlow 2.0. The README does not describe the dataset used for fine-tuning and provides little documentation beyond the execution command above.