dataherald by Dataherald

NL-to-SQL engine for enterprise question answering over relational data

created 2 years ago
3,535 stars

Top 14.0% on sourcepulse

1 Expert Loves This Project
Project Summary

Dataherald provides a natural language-to-SQL engine for enterprise data querying, enabling business users to extract insights from relational databases using plain English. It offers an API for direct database interaction and can be integrated into SaaS applications or used to create ChatGPT plugins for proprietary data.

How It Works

Dataherald leverages Large Language Models (LLMs) to translate natural language questions into SQL queries. The system is designed as a monorepo with distinct services: the core Engine for NL-to-SQL translation, the Enterprise API layer for authentication and business logic, an Admin Console for GUI configuration, and a Slackbot for interactive querying. This modular architecture allows for flexible deployment, from a simple engine-only setup to a full-featured enterprise solution.
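
To make the translation step concrete, here is a minimal sketch of the general NL-to-SQL pattern: read the schema, build a prompt, ask a model for SQL, and refuse anything that is not a single SELECT before running it. This is not Dataherald's implementation. The names build_prompt, is_read_only, answer, and fake_llm are hypothetical, fake_llm is a canned stand-in for a real LLM call, and SQLite is used only so the example is self-contained.

```python
import sqlite3
from typing import Callable


def build_prompt(schema: str, question: str) -> str:
    """Combine the table schema and the user's question into one LLM prompt."""
    return (
        "Given the following database schema, write one SQL SELECT statement "
        "that answers the question.\n\n"
        f"Schema:\n{schema}\n\nQuestion: {question}\nSQL:"
    )


def is_read_only(sql: str) -> bool:
    """Crude guard: accept only a single statement that starts with SELECT."""
    statement = sql.strip().rstrip(";")
    return statement.lower().startswith("select") and ";" not in statement


def answer(conn: sqlite3.Connection, question: str, llm: Callable[[str], str]) -> list[tuple]:
    # Read the schema from SQLite's catalog so the prompt reflects the real tables.
    rows = conn.execute("SELECT sql FROM sqlite_master WHERE type = 'table'").fetchall()
    schema = "\n".join(row[0] for row in rows)
    sql = llm(build_prompt(schema, question))
    if not is_read_only(sql):
        raise ValueError(f"Refusing to run non-SELECT statement: {sql!r}")
    return conn.execute(sql).fetchall()


def fake_llm(prompt: str) -> str:
    """Hypothetical stand-in for a real LLM call; returns a canned query."""
    return "SELECT COUNT(*) FROM orders WHERE status = 'shipped';"


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT)")
conn.executemany(
    "INSERT INTO orders (status) VALUES (?)",
    [("shipped",), ("pending",), ("shipped",)],
)
result = answer(conn, "How many orders have shipped?", fake_llm)
print(result)  # [(2,)]
```

The string check in is_read_only is deliberately simple and would also reject a valid query whose literal contains a semicolon. A production setup would more likely rely on a read-only database role than on string inspection.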

Quick Start & Requirements

  • Install/Run: Run sh docker-run.sh from the repository root directory.
  • Prerequisites: Docker and Docker Compose.
  • Setup: Create a .env file in each service directory, based on that directory's .env.example file, and set the environment variables the service requires.
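
Creating the .env files by hand is easy to get wrong when there are several services. The sketch below shows one way to seed them, assuming only what the summary states: each service directory ships a .env.example. The helper name bootstrap_env_files is hypothetical and is not part of the Dataherald repository.

```python
import shutil
from pathlib import Path


def bootstrap_env_files(root: Path) -> list[Path]:
    """Copy each .env.example to a sibling .env unless one already exists."""
    created = []
    for example in sorted(root.rglob(".env.example")):
        target = example.with_name(".env")
        if not target.exists():
            # Never overwrite an existing .env; it may already hold real values.
            shutil.copyfile(example, target)
            created.append(target)
    return created
```

Calling bootstrap_env_files(Path(".")) from the repository root returns the list of files it created. The copied files still contain only the example values, so each one needs to be edited before the services are started.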

Highlighted Details

  • Enables Q&A directly from production databases.
  • Facilitates ChatGPT plugin creation from proprietary data.
  • Offers an API for seamless integration.
  • Includes an Admin Console for GUI configuration and observability.

Maintenance & Community

The project welcomes contributions for features, infrastructure, and documentation. Further details on contributing are available via a provided link.

Licensing & Compatibility

The README does not state a license. Check the repository for a license file before commercial use or closed-source linking.

Limitations & Caveats

The README does not specify the underlying LLM used or provide performance benchmarks. The setup requires careful configuration of environment variables for each service.
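
Because a missing variable in any one service can break the whole stack, a small pre-flight check may help. The sketch below compares the variable names in a service's .env against its .env.example and reports the ones that are absent. The helpers env_keys and missing_keys are hypothetical, and the parser assumes plain KEY=VALUE lines; it does not handle export prefixes or multi-line values.

```python
from pathlib import Path


def env_keys(path: Path) -> set[str]:
    """Return variable names from a KEY=VALUE file, skipping blanks and comments."""
    keys = set()
    for line in path.read_text().splitlines():
        line = line.strip()
        if line and not line.startswith("#") and "=" in line:
            keys.add(line.split("=", 1)[0].strip())
    return keys


def missing_keys(service_dir: Path) -> set[str]:
    """Names present in .env.example but absent from .env in one service directory."""
    example = service_dir / ".env.example"
    actual = service_dir / ".env"
    defined = env_keys(actual) if actual.exists() else set()
    return env_keys(example) - defined
```

An empty set from missing_keys means every name in the example file is defined. It does not confirm that the values themselves are valid.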

Health Check

  • Last commit: 1 year ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0

Star History

64 stars in the last 90 days

Explore Similar Projects

Starred by Chip Huyen (author of AI Engineering and Designing Machine Learning Systems), Jeremy Howard (cofounder of fast.ai), and 3 more.

cohere-toolkit by cohere-ai

0.2%
3k
RAG toolkit for LLM application development and deployment
created 1 year ago
updated 1 week ago