CLI tools for semantic search and document parsing
Top 38.6% on SourcePulse
This project provides high-performance command-line tools for document processing and semantic search, built with Rust. It's designed for developers and power users who need efficient, local, or cloud-assisted text analysis and retrieval capabilities, offering a Unix-friendly interface for seamless integration into existing workflows.
How It Works
The parse tool uses the LlamaParse API (or other backends) to convert various document formats (PDF, DOCX, etc.) into markdown, with features like caching and concurrent processing for speed.
The search tool performs fast, local semantic keyword searches using model2vec embeddings and cosine similarity, offering per-line context matching and configurable distance thresholds without requiring a separate vector database.
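The per-line matching described above can be illustrated with a minimal Rust sketch. It is not the project's implementation: the embed function here is a hypothetical stand-in that hashes words into buckets (a real model2vec embedding is learned), and the names search, cosine_distance, and max_distance are chosen for illustration only. It shows how each line can be scored against a query by cosine distance and kept only if it falls under a threshold, with no vector database involved.

```rust
/// Dimensionality of the toy embedding (an arbitrary choice for this sketch).
const DIM: usize = 64;

/// Hypothetical stand-in for an embedding model: hashes each lowercased word
/// into one of DIM buckets with FNV-1a and counts occurrences.
fn embed(text: &str) -> Vec<f32> {
    let mut v = vec![0.0f32; DIM];
    for word in text
        .split(|c: char| !c.is_alphanumeric())
        .filter(|w| !w.is_empty())
    {
        let mut h: u32 = 2166136261; // FNV-1a offset basis
        for b in word.to_lowercase().bytes() {
            h ^= b as u32;
            h = h.wrapping_mul(16777619); // FNV-1a prime
        }
        v[(h as usize) % DIM] += 1.0;
    }
    v
}

/// Cosine distance = 1 - cosine similarity. Returns 1.0 when either vector
/// is all zeros (for example, an empty line).
fn cosine_distance(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let na = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let nb = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if na == 0.0 || nb == 0.0 {
        return 1.0;
    }
    1.0 - dot / (na * nb)
}

/// Scores every line of `text` against `query` and returns the lines whose
/// distance is at most `max_distance`, closest first, as
/// (1-based line number, distance, line).
fn search<'a>(query: &str, text: &'a str, max_distance: f32) -> Vec<(usize, f32, &'a str)> {
    let q = embed(query);
    let mut hits: Vec<_> = text
        .lines()
        .enumerate()
        .map(|(i, line)| (i + 1, cosine_distance(&q, &embed(line)), line))
        .filter(|&(_, d, _)| d <= max_distance)
        .collect();
    hits.sort_by(|a, b| a.1.total_cmp(&b.1));
    hits
}
```

Calling search("cargo install", contents, 0.3) on a file's contents would return only the lines within distance 0.3 of the query. Lowering the threshold tightens the match, and raising it admits looser ones.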
Quick Start & Requirements
Install: cargo install semtools (or --features=parse or --features=search for specific tools).
parse tool: Requires a LlamaIndex Cloud API key, configurable via ~/.parse_config.json or the LLAMA_CLOUD_API_KEY environment variable.
Highlighted Details
Maintenance & Community
model2vec-rs and simsimd.
Licensing & Compatibility
Limitations & Caveats
The parse tool defaults to the LlamaParse API, which requires an API key and internet connectivity. Future work includes adding local-only parsing backends.