AI-Northstar-Tech: Universal vector dataset tooling
Top 96.9% on SourcePulse
This library provides a universal interface for vector datasets, enabling seamless export, import, and re-embedding across various vector databases and RAG platforms. It targets developers and researchers working with large-scale vector data, offering a standardized format (VDF) to abstract away database-specific complexities and facilitate data migration and model experimentation.
How It Works
The core of vector-io is the Universal Vector Dataset Format (VDF), a standardized structure comprising a VDF_META.json file and associated Parquet files. This format decouples data from specific vector databases, allowing for agnostic operations. The library provides CLI tools (export_vdf, import_vdf, reembed_vdf) that leverage this format to translate data between different vector stores and to re-generate embeddings using specified models.
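As a rough illustration of the metadata half of this format, the sketch below writes and reads a VDF_META.json file using only the Python standard library. The keys model_name, dimensions, and metric are the ones named on this page; the data_files key, the helper function names, and the example values are hypothetical and are not taken from the vdf-io library, whose actual schema may differ. The Parquet files themselves are not produced here.

```python
import json
import tempfile
from pathlib import Path


def build_vdf_meta(model_name: str, dimensions: int, metric: str,
                   parquet_files: list[str]) -> dict:
    """Assemble a metadata dict for a vector dataset.

    Only model_name, dimensions, and metric come from the page text;
    "data_files" is an assumed key used for illustration.
    """
    if dimensions <= 0:
        raise ValueError("dimensions must be a positive integer")
    return {
        "model_name": model_name,
        "dimensions": dimensions,
        "metric": metric,
        "data_files": list(parquet_files),  # assumed key, not from the library
    }


def write_vdf_meta(dataset_dir: Path, meta: dict) -> Path:
    """Write the metadata next to where the Parquet files would live."""
    dataset_dir.mkdir(parents=True, exist_ok=True)
    meta_path = dataset_dir / "VDF_META.json"
    meta_path.write_text(json.dumps(meta, indent=2), encoding="utf-8")
    return meta_path


def read_vdf_meta(dataset_dir: Path) -> dict:
    """Load the metadata back, independent of any vector database."""
    return json.loads((dataset_dir / "VDF_META.json").read_text(encoding="utf-8"))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        dataset = Path(tmp) / "example_dataset"
        meta = build_vdf_meta("example-model", 384, "cosine", ["part-0.parquet"])
        write_vdf_meta(dataset, meta)
        print(read_vdf_meta(dataset))
```

Because the description travels with the data as a plain JSON file, a tool reading the dataset can check the embedding model and distance metric without connecting to the database the vectors came from.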
Quick Start & Requirements
pip install vdf-io
Highlighted Details
reembed_vdf utility to change embedding models without altering the vector store.
model_name, dimensions, and metric for comprehensive dataset description.
Maintenance & Community
Licensing & Compatibility
Limitations & Caveats