Research paper code for contrastive self-supervised sentence representation transfer
ConSERT is a contrastive learning framework designed to improve sentence representations derived from pre-trained language models like BERT. It addresses the issue of representation collapse, which hinders performance on Semantic Textual Similarity (STS) tasks, by fine-tuning models using unlabeled text. This framework is beneficial for researchers and practitioners in NLP seeking robust and transferable sentence embeddings.
How It Works
ConSERT employs a contrastive learning objective to fine-tune BERT models. It leverages unlabeled text data to push semantically similar sentences closer in the embedding space and dissimilar sentences further apart. The sentence representations are generated by averaging token embeddings from the last two layers of the BERT model. This approach effectively mitigates the collapse problem, leading to improved performance on downstream tasks, particularly STS.
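The two ingredients described above — pooling sentence vectors by averaging token embeddings from the last two layers, and a contrastive (NT-Xent-style) objective over two views of each sentence — can be sketched as follows. This is a minimal NumPy illustration, not the repository's actual implementation; the function names, the masked-mean details, and the temperature value are assumptions:

```python
import numpy as np

def pool_last_two_layers(layer_outputs, attention_mask):
    # Masked mean over tokens, averaged across the last two encoder layers.
    # layer_outputs: list of (N, T, d) arrays, one per layer; attention_mask: (N, T).
    h = (layer_outputs[-1] + layer_outputs[-2]) / 2.0      # (N, T, d)
    mask = attention_mask[:, :, None].astype(h.dtype)      # (N, T, 1)
    return (h * mask).sum(axis=1) / mask.sum(axis=1)       # (N, d)

def ntxent_loss(z1, z2, temperature=0.1):
    # NT-Xent contrastive loss between two augmented views of N sentences:
    # each sentence's other view is its positive; all other 2N-2 are negatives.
    z = np.concatenate([z1, z2], axis=0)                   # (2N, d)
    z = z / np.linalg.norm(z, axis=1, keepdims=True)       # cosine-similarity space
    sim = z @ z.T / temperature                            # (2N, 2N) scaled similarities
    n = z1.shape[0]
    np.fill_diagonal(sim, -np.inf)                         # exclude self-similarity
    pos = np.concatenate([np.arange(n, 2 * n), np.arange(n)])  # index of the paired view
    log_denom = np.log(np.exp(sim).sum(axis=1))
    return float(np.mean(log_denom - sim[np.arange(2 * n), pos]))
```

Minimizing this loss pulls the two views of each sentence together while pushing all other sentences in the batch apart, which counteracts the collapse of embeddings into a narrow cone.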
Quick Start & Requirements
Dependencies:
- torch==1.6.0
- cudatoolkit==10.0.103
- cudnn==7.6.5
- sentence-transformers==0.3.9
- transformers==3.4.0
- apex==0.1.0 (Apex needs to be cloned and installed separately)

Download the pre-trained BERT model (bert-base-uncased) and the English and Chinese STS datasets using the provided scripts. Training scripts live in the ./scripts directory (e.g., bash scripts/unsup-consert-base.sh). The max_seq_length setting can be adjusted in the scripts.
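A possible environment setup following the pinned versions above. This is a hedged sketch, not the repository's documented procedure: the conda environment name, the exact conda/pip split, and the CUDA channel packaging are assumptions.

```shell
# Assumed setup; package versions are taken from the requirements list above.
conda create -n consert python=3.7 -y
conda activate consert
conda install -y cudatoolkit=10.0 cudnn=7.6.5      # CUDA packaging via conda is an assumption
pip install torch==1.6.0 sentence-transformers==0.3.9 transformers==3.4.0

# Apex needs to be cloned and installed separately (no test-ready PyPI wheel).
git clone https://github.com/NVIDIA/apex
cd apex && pip install -v --no-cache-dir ./ && cd ..

# Example: launch unsupervised training with the base model.
bash scripts/unsup-consert-base.sh
```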
Maintenance & Community
The project is associated with the ACL 2021 paper "ConSERT: A Contrastive Framework for Self-Supervised Sentence Representation Transfer." No specific community channels or active maintenance indicators are mentioned in the README.
Licensing & Compatibility
The repository does not explicitly state a license. The code is provided for research purposes related to the ACL 2021 paper. Commercial use or linking with closed-source projects would require clarification on licensing.
Limitations & Caveats
The provided results for large models may differ slightly from the paper due to updated PyTorch/CUDA versions and an adjusted max_seq_length. The dependencies are pinned to older versions (e.g., PyTorch 1.6.0 and CUDA 10.0), which may pose compatibility challenges in newer environments.
Last recorded activity: about 3 years ago; the repository appears inactive.