netease-youdao
Open-source embedding/reranker models for RAG
Top 23.2% on SourcePulse
BCEmbedding provides open-source bilingual and crosslingual embedding and reranker models designed for Retrieval-Augmented Generation (RAG) applications. Developed by NetEase Youdao, it targets developers and researchers building RAG systems that need robust performance in both Chinese and English, and it offers a two-stage retrieval solution.
How It Works
BCEmbedding utilizes a two-stage retrieval process. The EmbeddingModel acts as a dual-encoder for efficient first-stage retrieval, generating semantic vectors. The RerankerModel then employs a cross-encoder for a second-stage refinement, re-ranking retrieved passages for enhanced precision and relevance. This approach leverages Youdao's translation engine for strong bilingual and crosslingual capabilities, aiming for out-of-the-box usability without fine-tuning.
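The two-stage flow described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not BCEmbedding's implementation: `embed` and `pair_score` are hypothetical toy stand-ins (bag-of-words cosine and token overlap) for the dual-encoder EmbeddingModel and the cross-encoder RerankerModel, and the sample passages are made up for the example.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for a dual-encoder: each text is encoded on its own."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse bag-of-words vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def pair_score(query: str, passage: str) -> float:
    """Toy stand-in for a cross-encoder: scores the (query, passage) pair jointly.

    Here the score is the fraction of query tokens found in the passage.
    """
    q_tokens = query.lower().split()
    p_tokens = set(passage.lower().split())
    return sum(t in p_tokens for t in q_tokens) / len(q_tokens) if q_tokens else 0.0


def retrieve(query, passage_vecs, top_k):
    """Stage 1: cheap vector search over precomputed passage vectors."""
    q_vec = embed(query)
    order = sorted(
        range(len(passage_vecs)),
        key=lambda i: cosine(q_vec, passage_vecs[i]),
        reverse=True,
    )
    return order[:top_k]


def rerank(query, passages, candidates, top_n):
    """Stage 2: rescore only the retrieved candidates, pair by pair."""
    scored = [(i, pair_score(query, passages[i])) for i in candidates]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_n]


# Hypothetical sample corpus for illustration only.
passages = [
    "BCEmbedding provides embedding and reranker models for RAG",
    "The reranker model refines retrieved passages",
    "Install the package with pip",
    "Embedding models map text to semantic vectors",
]
passage_vecs = [embed(p) for p in passages]  # built once, ahead of query time

query = "reranker models for RAG"
candidates = retrieve(query, passage_vecs, top_k=3)
results = rerank(query, passages, candidates, top_n=2)
for index, score in results:
    print(f"{score:.2f}  {passages[index]}")
```

The split matters because the first stage only compares precomputed vectors, so it scales to a large corpus, while the second stage reads each query and passage together and is therefore run only on the short candidate list. In a real setup the two stand-ins would be replaced by the package's EmbeddingModel and RerankerModel; the project's own documentation is the reference for the exact calls.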
Quick Start & Requirements
pip install BCEmbedding or install from source.
transformers, sentence-transformers, langchain, llama-index (for integrations). GPU recommended for optimal performance.
Highlighted Details
RerankerModel supports long passages (up to 32k tokens) and provides meaningful relevance scores.
EmbeddingModel, simplifying integration.
Maintenance & Community
Licensing & Compatibility
Limitations & Caveats
EmbeddingModel is noted as "coming soon."