dinobase by DinobaseHQ

Agent-first data platform for unified data access and manipulation

Created 1 month ago
256 stars

Top 98.6% on SourcePulse

Project Summary

Dinobase addresses architectural limitations of agent stacks by providing a unified data platform. It lets agents query 100+ data sources through SQL, removing data silos and making cross-source JOINs possible where separate per-API tools cannot. According to the project's benchmarks, this improves agent accuracy, speed, and cost-efficiency on complex data-driven tasks.

How It Works

The platform syncs data from numerous sources (APIs, databases, files, and MCP servers) into a queryable format, typically Parquet. It uses DuckDB as the core query engine and metadata store, so agents can run a single SQL query across disparate datasets. Dinobase integrates with the Model Context Protocol (MCP) for agent interaction and supports reverse ETL: agents issue data mutations as SQL, which pass through a preview/confirm step before being applied. An optional semantic layer, powered by LLMs, annotates data for richer agent context.
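To make the "single SQL query across disparate datasets" idea concrete, here is a minimal sketch. It is not Dinobase's API: it uses Python's standard-library sqlite3 as a stand-in for DuckDB so that it runs without extra dependencies, and the table names (crm_contacts, billing_invoices), columns, and rows are hypothetical examples of data already synced from two separate sources.

```python
import sqlite3

# Stand-in engine: sqlite3 instead of DuckDB, so the sketch needs no third-party packages.
conn = sqlite3.connect(":memory:")

# Hypothetical tables representing data already synced from two separate sources.
conn.executescript("""
CREATE TABLE crm_contacts (contact_id INTEGER PRIMARY KEY, email TEXT, company TEXT);
CREATE TABLE billing_invoices (invoice_id INTEGER PRIMARY KEY, customer_email TEXT, amount_usd REAL);
INSERT INTO crm_contacts VALUES
    (1, 'ada@example.com', 'Acme'),
    (2, 'bob@example.com', 'Globex');
INSERT INTO billing_invoices VALUES
    (10, 'ada@example.com', 120.0),
    (11, 'ada@example.com', 80.0),
    (12, 'bob@example.com', 50.0);
""")


def revenue_by_company(connection):
    """Join CRM and billing rows in one query and total the invoices per company."""
    query = """
        SELECT c.company, SUM(i.amount_usd) AS total_usd
        FROM crm_contacts AS c
        JOIN billing_invoices AS i ON i.customer_email = c.email
        GROUP BY c.company
        ORDER BY c.company
    """
    return connection.execute(query).fetchall()


rows = revenue_by_company(conn)
print(rows)
```

Without a shared query engine, an agent would have to call the CRM tool and the billing tool separately and match the records itself; here the join and the aggregation happen in one statement.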

Quick Start & Requirements

  • Install: Recommended: curl -fsSL https://dinobase.ai/install.sh | bash. Alternatives include uv tool install dinobase, pip install dinobase, or pipx install dinobase.
  • Prerequisites: A Python environment, API keys for the data sources you connect, and cloud storage credentials (S3, GCS, or Azure) if you use a cloud storage backend.
  • Links: Docs, Getting Started, Connectors.

Highlighted Details

  • Performance Claims: The project's benchmarks report 91% accuracy (vs. 35%), 3x the speed, and a 16-22x lower cost per correct answer than per-connector MCP tools, measured across 11 LLMs.
  • Connector Ecosystem: Supports more than 100 connectors spanning CRM, Billing, Support, Developer Tools, Databases, Cloud Storage, and more.
  • Reverse ETL: Enables agents to write data back to upstream sources through SQL mutations, guarded by a preview/confirm safety flow.
  • Semantic Layer: Optional LLM-driven annotation provides table descriptions, column documentation, and PII flagging for enhanced agent understanding.
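The preview/confirm flow mentioned under Reverse ETL can be illustrated with a local analogy. The sketch below is an assumption-laden illustration, not Dinobase's implementation: the PreviewedMutation class, its method names, and the tickets table are hypothetical, and it relies on a sqlite3 transaction to hold a change until it is confirmed, whereas writes to upstream APIs would need their own mechanism.

```python
import sqlite3


class PreviewedMutation:
    """Hypothetical helper: run a SQL mutation, but commit only after confirmation."""

    def __init__(self, connection, sql, params=()):
        self.connection = connection
        self.sql = sql
        self.params = params
        self.affected_rows = None

    def preview(self):
        # sqlite3 opens a transaction implicitly before UPDATE/INSERT/DELETE,
        # so the change is visible on this connection but not yet committed.
        cursor = self.connection.execute(self.sql, self.params)
        self.affected_rows = cursor.rowcount
        return self.affected_rows

    def confirm(self):
        if self.affected_rows is None:
            raise RuntimeError("call preview() before confirm()")
        self.connection.commit()

    def cancel(self):
        # Discard the pending change.
        self.connection.rollback()
        self.affected_rows = None


def count_open(connection):
    return connection.execute(
        "SELECT COUNT(*) FROM tickets WHERE status = 'open'"
    ).fetchone()[0]


conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tickets (id INTEGER PRIMARY KEY, status TEXT);
INSERT INTO tickets VALUES (1, 'open'), (2, 'open'), (3, 'closed');
""")
conn.commit()

mutation = PreviewedMutation(
    conn, "UPDATE tickets SET status = ? WHERE status = ?", ("closed", "open")
)

previewed = mutation.preview()        # number of rows the change would touch
mutation.cancel()                     # reviewer rejects: nothing is written
open_after_cancel = count_open(conn)

mutation.preview()
mutation.confirm()                    # reviewer approves: change is committed
open_after_confirm = count_open(conn)

print(previewed, open_after_cancel, open_after_confirm)
```

The design point is that the agent sees the scope of a mutation (here, the affected row count) before anything becomes permanent, which gives a human or a policy check a chance to reject it.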

Maintenance & Community

  • Community: An active Slack community is available for support and discussion.
  • Development: Standard open-source development model via GitHub.

Licensing & Compatibility

  • License: MIT (Expat).
  • Compatibility: Permissive license suitable for commercial use and integration into closed-source applications.

Limitations & Caveats

The README does not detail specific known bugs, alpha status, or unsupported platforms. The effectiveness of the semantic layer depends on the quality of the underlying LLM and API key configuration. Real-world performance and cost savings may vary based on data complexity and usage patterns.

Health Check

  • Last Commit: 2 weeks ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 9
  • Issues (30d): 1

Star History

99 stars in the last 30 days

Explore Similar Projects

Starred by Elie Bursztein (Cybersecurity Lead at Google DeepMind), Michael Chiang (Cofounder of Ollama), and 2 more.

enrichmcp by featureform

ORM for AI agents

  • Top 0% on SourcePulse
  • 644 stars
  • Created 1 year ago
  • Updated 2 months ago

Starred by Chaoyu Yang (Founder of Bento), Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems"), and 3 more.

DB-GPT by eosphoros-ai

AI-native data app development framework with agentic workflow

  • Top 0.3% on SourcePulse
  • 19k stars
  • Created 3 years ago
  • Updated 2 days ago