QwenLM
State-of-the-art multimodal embedding and reranking for information retrieval
Qwen3-VL-Embedding and Qwen3-VL-Reranker provide state-of-the-art multimodal embedding and reranking, built on Qwen3-VL. They enable advanced information retrieval and cross-modal understanding by processing text, images, screenshots, and videos within a unified framework. With a shared representation space and precise reranking, these models improve retrieval accuracy across more than 30 languages.
How It Works
The Embedding model uses a dual-tower architecture to map diverse inputs into a high-dimensional semantic vector, suitable for efficient, large-scale retrieval. The Reranking model employs a single-tower architecture with Cross-Attention for deep inter-modal fusion, precisely scoring relevance for query-document pairs to refine initial recall. This tandem approach optimizes both recall and precision.
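The two-stage flow described above can be sketched in a few lines. This is a minimal illustration of the retrieve-then-rerank pattern, not the project's API: the vectors are hand-written stand-ins for embedding-model output, and `score_pair` is a hypothetical placeholder for the reranker, which in practice would score each query-document pair jointly.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def recall(query_vec, doc_vecs, k):
    """Stage 1: rank documents by similarity of precomputed embeddings, keep top k."""
    ranked = sorted(doc_vecs.items(),
                    key=lambda item: cosine(query_vec, item[1]),
                    reverse=True)
    return [doc_id for doc_id, _ in ranked[:k]]

def rerank(query, candidates, score_pair):
    """Stage 2: rescore each (query, document) pair and reorder the candidates."""
    return sorted(candidates, key=lambda doc_id: score_pair(query, doc_id), reverse=True)

# Toy data (assumption): stand-ins for vectors an embedding model would produce.
doc_vecs = {
    "cat_photo": [0.9, 0.1, 0.0],
    "dog_video": [0.7, 0.3, 0.1],
    "tax_form":  [0.0, 0.1, 0.9],
}
query_vec = [1.0, 0.2, 0.0]

# Hypothetical pairwise scores standing in for the reranker's output.
toy_scores = {"cat_photo": 0.4, "dog_video": 0.8}

def score_pair(query, doc_id):
    return toy_scores[doc_id]

candidates = recall(query_vec, doc_vecs, k=2)
final = rerank("a pet playing outside", candidates, score_pair)
print(candidates, final)
```

The design point is that stage 1 compares vectors computed independently for query and documents, so document vectors can be indexed ahead of time, while stage 2 runs the more expensive pairwise scoring only on the short candidate list.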
Quick Start & Requirements
Installation involves cloning the repository and running scripts/setup_environment.sh to set up dependencies. Models are available on Hugging Face and ModelScope. Usage examples cover standard Transformers and vLLM integration. Hardware requirements (GPU, VRAM) are not specified, though the model sizes (2B and 8B parameters) suggest that a GPU with substantial memory is needed. Flash Attention 2 is recommended for acceleration.
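Since the README gives no VRAM figures, a rough lower bound can be derived from the parameter counts alone. The sketch below assumes the nominal sizes in the model names (2 billion and 8 billion parameters) and 16-bit weights at 2 bytes per parameter. It counts weights only, so activations, caches, and image or video inputs would add to these numbers, and the true parameter counts may differ from the nominal ones.

```python
def weight_memory_gib(num_params, bytes_per_param=2):
    """Approximate memory for model weights alone, in GiB."""
    return num_params * bytes_per_param / 2**30

# Nominal parameter counts taken from the model size labels (assumption).
for name, params in (("2B", 2e9), ("8B", 8e9)):
    print(f"{name}: ~{weight_memory_gib(params):.1f} GiB of weights at 16-bit precision")
```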
Highlighted Details
Maintenance & Community
No specific community channels or detailed maintenance information are provided. The project appears research-driven, with authors listed in the citation.
Licensing & Compatibility
The README does not specify the software license. This lack of clarity is a significant adoption blocker, leaving terms for commercial use or integration with closed-source projects undefined.
Limitations & Caveats
Detailed Transformers usage examples for the Reranker are marked "Coming soon." Hardware requirements are not specified, and benchmarks are limited to the tables provided in the README. The absence of a specified license is a critical caveat.