Research paper code for dense retrieval pre-training
This repository provides the code and pre-trained models for Condenser, a family of Transformer architectures designed for efficient dense retrieval. It targets researchers and practitioners in Natural Language Processing (NLP) and Information Retrieval (IR) looking to improve the performance of dense passage retrieval systems. The primary benefit is enhanced retrieval accuracy through specialized pre-training objectives.
How It Works
Condenser introduces novel pre-training architectures that optimize Transformer models for dense retrieval tasks. It augments a standard Transformer backbone with a short Condenser head that performs masked language modeling ("late MLM") over the late [CLS] representation together with token representations taken from an earlier ("skip-from") layer, forcing the [CLS] vector to aggregate passage-level information. This approach aims to improve retrieval performance compared to standard BERT or RoBERTa models fine-tuned for retrieval.
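The sketch below illustrates this design in plain PyTorch. It is not the repository's implementation; the layer counts, hidden size, and vocabulary size are illustrative assumptions.

```python
import torch
import torch.nn as nn

class CondenserSketch(nn.Module):
    """Illustrative Condenser-style model: Transformer backbone plus a late-MLM head."""

    def __init__(self, hidden=768, heads=12, n_early=6, n_late=6, n_head_layers=2, vocab=30522):
        super().__init__()
        make = lambda: nn.TransformerEncoderLayer(hidden, heads, batch_first=True)
        self.early = nn.ModuleList(make() for _ in range(n_early))       # lower backbone layers
        self.late = nn.ModuleList(make() for _ in range(n_late))         # upper backbone layers
        self.head = nn.ModuleList(make() for _ in range(n_head_layers))  # Condenser head
        self.mlm = nn.Linear(hidden, vocab)                              # MLM prediction layer

    def forward(self, x):
        # x: (batch, seq_len, hidden) embedded tokens; position 0 is [CLS]
        for blk in self.early:
            x = blk(x)
        early_tokens = x[:, 1:]      # token states taken from the "skip-from" layer
        for blk in self.late:
            x = blk(x)
        late_cls = x[:, :1]          # [CLS] state after the full backbone
        # The head sees only the late [CLS] plus early token states, so the
        # [CLS] vector must carry the passage-level information needed for MLM.
        h = torch.cat([late_cls, early_tokens], dim=1)
        for blk in self.head:
            h = blk(h)
        return self.mlm(h)           # logits for the "late MLM" loss
```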
Quick Start & Requirements
Pre-trained checkpoints are published on the Hugging Face Hub (Luyu/condenser, Luyu/co-condenser-wiki, Luyu/co-condenser-marco) and can be loaded directly with transformers.AutoModel.from_pretrained().
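A minimal usage sketch, assuming the checkpoint ships its tokenizer files (otherwise the matching BERT tokenizer can be substituted):

```python
from transformers import AutoModel, AutoTokenizer

name = "Luyu/co-condenser-marco"                 # any of the checkpoints listed above
tokenizer = AutoTokenizer.from_pretrained(name)  # assumes tokenizer files are included
model = AutoModel.from_pretrained(name)

inputs = tokenizer("how do dense retrievers work?", return_tensors="pt")
outputs = model(**inputs)
cls_vector = outputs.last_hidden_state[:, 0]     # [CLS] embedding, typically used as the text representation
```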
Highlighted Details
Maintenance & Community
The repository was last updated about 3 years ago and is currently marked inactive.
Licensing & Compatibility
Limitations & Caveats
The README notes that pairing a randomly initialized head with the released pre-trained weights can corrupt the model, so the provided head weights should be used for any further pre-training. Contrastive (coCondenser) pre-training also requires a large effective batch size, which may necessitate gradient caching techniques when GPU memory is limited.
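For context, the sketch below shows a generic in-batch-negative contrastive loss (plain PyTorch, not the repository's trainer): every other passage in the batch acts as a negative, so the number of negatives scales with the effective batch size, which is why gradient caching helps when memory is tight.

```python
import torch
import torch.nn.functional as F

def in_batch_contrastive_loss(q_reps, p_reps, temperature=1.0):
    """q_reps, p_reps: (batch, dim) query and positive-passage embeddings."""
    scores = q_reps @ p_reps.T / temperature                      # (batch, batch) similarities
    targets = torch.arange(q_reps.size(0), device=q_reps.device)  # positives on the diagonal
    # Row i treats passage i as the positive and the remaining batch-1
    # passages as negatives: a larger batch means more negatives per query.
    return F.cross_entropy(scores, targets)
```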