dataherald by Dataherald

NL-to-SQL engine for enterprise question answering over relational data

Created 2 years ago

3,619 stars

Top 13.2% on SourcePulse

Project Summary

Dataherald provides a natural language-to-SQL engine for enterprise data querying, enabling business users to extract insights from relational databases using plain English. It offers an API for direct database interaction and can be integrated into SaaS applications or used to create ChatGPT plugins for proprietary data.

How It Works

Dataherald leverages Large Language Models (LLMs) to translate natural language questions into SQL queries. The system is designed as a monorepo with distinct services: the core Engine for NL-to-SQL translation, the Enterprise API layer for authentication and business logic, an Admin Console for GUI configuration, and a Slackbot for interactive querying. This modular architecture allows for flexible deployment, from a simple engine-only setup to a full-featured enterprise solution.
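The translation step described above can be sketched in a few lines. This is a minimal illustration of the general NL-to-SQL pattern, not Dataherald's actual implementation: the function names (schema_summary, build_prompt, is_read_only, answer) are hypothetical, the model call is replaced by a stub (fake_llm), and an in-memory SQLite database stands in for a production database.

```python
import sqlite3
from typing import Callable


def schema_summary(conn: sqlite3.Connection) -> str:
    """Return the CREATE TABLE statements so the model can see the schema."""
    rows = conn.execute(
        "SELECT sql FROM sqlite_master "
        "WHERE type = 'table' AND sql IS NOT NULL ORDER BY name"
    ).fetchall()
    return "\n".join(row[0] for row in rows)


def build_prompt(schema: str, question: str) -> str:
    """Combine schema and question into one prompt (hypothetical wording)."""
    return (
        "Given this schema:\n"
        f"{schema}\n"
        "Write one SQLite SELECT statement that answers the question.\n"
        f"Question: {question}\nSQL:"
    )


def is_read_only(sql: str) -> bool:
    """Accept a single SELECT statement only (a deliberately strict check)."""
    stripped = sql.strip().rstrip(";").strip()
    return stripped.lower().startswith("select") and ";" not in stripped


def answer(conn: sqlite3.Connection, question: str,
           llm: Callable[[str], str]) -> list[tuple]:
    """Ask the model for SQL, refuse anything but a SELECT, then run it."""
    sql = llm(build_prompt(schema_summary(conn), question))
    if not is_read_only(sql):
        raise ValueError(f"refusing to run non-SELECT statement: {sql!r}")
    return conn.execute(sql).fetchall()


def fake_llm(prompt: str) -> str:
    # Stand-in for a real model call; it always returns the same query.
    return "SELECT COUNT(*) FROM orders WHERE status = 'shipped';"


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT)")
conn.executemany(
    "INSERT INTO orders (status) VALUES (?)",
    [("shipped",), ("pending",), ("shipped",)],
)
result = answer(conn, "How many orders have shipped?", fake_llm)
print(result)  # prints [(2,)]
```

The read-only check is intentionally conservative: it also rejects valid queries that start with WITH or contain a semicolon inside a string literal. A real deployment would more likely rely on a read-only database role than on string inspection.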

Quick Start & Requirements

Install/Run: Run sh docker-run.sh from the repository root.
Prerequisites: Docker and Docker Compose.
Setup: Each service reads its environment variables from a .env file; create one in each service directory from that directory's .env.example.
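Because every service needs its own .env, it can help to check for missing variables before starting the containers. The helper below is a hypothetical sketch, not part of the repository: it assumes each service lives in a direct subdirectory of the root and keeps a .env.example there, and the variable names in the demo are invented for illustration.

```python
from pathlib import Path


def env_keys(text: str) -> set[str]:
    """Collect variable names from KEY=value lines, skipping comments."""
    keys = set()
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key = line.split("=", 1)[0].strip()
        if key.startswith("export "):
            key = key[len("export "):].strip()
        keys.add(key)
    return keys


def missing_keys(example_text: str, env_text: str) -> list[str]:
    """Names defined in the example file but absent from the .env file."""
    return sorted(env_keys(example_text) - env_keys(env_text))


def check_services(root: Path) -> dict[str, list[str]]:
    """Map each directory holding a .env.example to the keys its .env lacks."""
    report = {}
    for example in sorted(root.glob("*/.env.example")):
        env_file = example.with_name(".env")
        env_text = env_file.read_text() if env_file.exists() else ""
        report[example.parent.name] = missing_keys(example.read_text(), env_text)
    return report


# Demo with invented variable names (assumptions, not the project's real keys).
example_text = "EXAMPLE_API_KEY=\n# a comment\nEXAMPLE_DB_URI=\n"
env_text = "EXAMPLE_API_KEY=abc123\n"
print(missing_keys(example_text, env_text))  # prints ['EXAMPLE_DB_URI']
```

Calling check_services(Path(".")) from the repository root would list, per service directory, the variables still to be filled in. The check only looks at names, so a variable that is present but left empty is not reported.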

Highlighted Details

Enables Q&A directly from production databases.
Facilitates ChatGPT plugin creation from proprietary data.
Offers an API for seamless integration.
Includes an Admin Console for GUI configuration and observability.

Maintenance & Community

The project welcomes contributions to features, infrastructure, and documentation. The README links to further details on contributing.

Licensing & Compatibility

The README does not state a license. Check the repository's license terms before commercial use or closed-source linking.

Limitations & Caveats

The README does not name the underlying LLM or provide performance benchmarks. Setup is per service: each one needs its own .env file filled in.

Health Check

Last Commit: 1 year ago
Responsiveness: Inactive
Star History: 17 stars in the last 30 days