SouravRoy-ETL
Embedded SQL database for direct file querying
Top 43.4% on SourcePulse
Summary
SlothDB is an embedded, file-first analytical SQL database designed for high performance across diverse environments, from local development to server deployments and web browsers. It enables users to query data directly from various file formats (CSV, Parquet, JSON, Avro, Excel, Arrow, SQLite) without requiring a separate import step or server process, offering significant speed advantages for analytical workloads on single machines.
How It Works
Built from scratch in C++20, SlothDB is a vectorized, columnar engine optimized for analytics. Its file-first architecture allows direct SQL querying of local or remote files. Key features include "live views" over growing files and an .ask sub-REPL that translates natural language to SQL, using a fast, local rules parser or an optional Qwen2.5-Coder LLM to keep data private. A size-optimized WebAssembly (WASM) build targets edge computing and browser environments.
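To make the idea of a local rules parser concrete, here is a minimal Python sketch of rule-based natural-language-to-SQL translation. It is not SlothDB's parser: the rule table, the translate function, and the quoted-file-path FROM syntax are all assumptions made for illustration. Returning None stands in for the point where an engine might hand the question to an optional LLM.

```python
import re
from typing import Optional

# Hypothetical rule table: (compiled pattern, SQL template).
# These rules and the FROM '<file>' syntax are illustrative assumptions,
# not SlothDB's actual grammar or SQL dialect.
RULES = [
    (re.compile(r"^how many rows (?:are )?in (?P<src>[\w./-]+)$", re.I),
     "SELECT COUNT(*) FROM '{src}'"),
    (re.compile(r"^show (?:the )?first (?P<n>\d+) rows (?:of|from) (?P<src>[\w./-]+)$", re.I),
     "SELECT * FROM '{src}' LIMIT {n}"),
    (re.compile(r"^average (?P<col>\w+) (?:in|from) (?P<src>[\w./-]+)$", re.I),
     "SELECT AVG({col}) FROM '{src}'"),
]


def translate(question: str) -> Optional[str]:
    """Return SQL for the first matching rule, or None if no rule applies."""
    # Normalize: trim whitespace and drop trailing question marks.
    text = question.strip().rstrip("?").strip()
    for pattern, template in RULES:
        match = pattern.match(text)
        if match:
            # Named groups fill the placeholders in the SQL template.
            return template.format(**match.groupdict())
    # No rule matched; a real system might fall back to an LLM here.
    return None


if __name__ == "__main__":
    print(translate("How many rows in sales.csv?"))
    print(translate("show first 5 rows of data.parquet"))
    print(translate("what is the meaning of life"))
```

A rule table like this runs entirely on the local machine and is cheap to evaluate, which is the usual reason to try it before a model.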
Quick Start & Requirements
Install via pip install slothdb (Python), npm install @slothdb/wasm (Node.js), or download the CLI binary. Python 3.8+ is recommended. A live playground is at https://slothdb.org/playground/, with documentation in docs/DOCUMENTATION.md. Demo: python -c "import slothdb; slothdb.demo()".
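The demo one-liner above can also be run from a script. The sketch below checks whether the package is importable before calling slothdb.demo(), which is the only SlothDB call used here; the helper names slothdb_available and run_demo are hypothetical, and what the demo prints is not assumed.

```python
import importlib.util


def slothdb_available() -> bool:
    """Report whether the slothdb package can be imported in this environment."""
    return importlib.util.find_spec("slothdb") is not None


def run_demo() -> bool:
    """Run the bundled demo if slothdb is installed; return True if it ran."""
    if not slothdb_available():
        print("slothdb is not installed; try: pip install slothdb")
        return False
    import slothdb  # imported lazily so the script works without the package

    slothdb.demo()  # entry point named in the quick-start text
    return True


if __name__ == "__main__":
    run_demo()
```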
Highlighted Details
.ask natural-language queries with local rules parser and optional LLM fallback (29 languages).
Maintenance & Community
Active Discord community (discord.gg/XJWyGmX5G). Robust CI, comprehensive tests, and active maintainer involvement via GitHub issues and Discord.
Licensing & Compatibility
MIT license permits unrestricted use, modification, and distribution, including commercial, closed-source applications.
Limitations & Caveats
Single-node embedded engine; no distributed execution. Lacks multi-writer transactions (MVCC) and is not optimized for OLTP. No secondary indexes (scan-based execution). Partial window function coverage. Only anonymous public S3 access. Young codebase may have SQL edge cases.