arunpshankar: LLM-powered Text-to-SQL architectures
This repository provides a collection of architectural patterns for leveraging Large Language Models (LLMs) to efficiently generate SQL queries from natural language text. It targets engineers and researchers seeking to streamline database interactions by translating complex natural language questions into executable SQL, with a specific focus on enhancing BigQuery capabilities. The project offers practical implementations and a guide to various LLM-driven approaches for robust and performant Text-to-SQL generation.
How It Works
The project explores five distinct architectural patterns for Text-to-SQL. These include using LLMs for intent detection and entity extraction, integrating Retrieval-Augmented Generation (RAG) with schema metadata for context-aware query formulation, and employing autonomous SQL agents with iterative refinement capabilities. Advanced patterns focus on direct schema inference coupled with self-correction mechanisms that utilize execution feedback to resolve errors, and a stochastic optimization approach that selects the fastest executing query from multiple trials. This multi-pattern approach aims to enhance accuracy, robustness, and performance in LLM-based SQL generation.
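As a rough illustration of the self-correction idea described above, the sketch below retries query generation and feeds the database error back to the generator until a query executes. It is a minimal sketch, not the repository's implementation: it uses the standard-library sqlite3 module in place of BigQuery, and `generate_with_self_correction`, `fake_llm`, and the `users` table are hypothetical names introduced here for illustration.

```python
import sqlite3
from typing import Callable, List, Optional, Tuple

# Hypothetical generator signature: (question, schema, previous_error) -> SQL text.
SqlGenerator = Callable[[str, str, Optional[str]], str]


def generate_with_self_correction(
    conn: sqlite3.Connection,
    question: str,
    schema: str,
    generate: SqlGenerator,
    max_attempts: int = 3,
) -> Tuple[str, List[tuple]]:
    """Ask the generator for SQL, run it, and pass any execution error back."""
    error: Optional[str] = None
    for _ in range(max_attempts):
        sql = generate(question, schema, error)
        try:
            rows = conn.execute(sql).fetchall()
            return sql, rows
        except sqlite3.Error as exc:
            # Execution feedback: the error text becomes input to the next attempt.
            error = str(exc)
    raise RuntimeError(
        f"no executable query after {max_attempts} attempts; last error: {error}"
    )


def fake_llm(question: str, schema: str, previous_error: Optional[str]) -> str:
    """Stand-in for a model call (assumption): the first draft misspells a column."""
    if previous_error is None:
        return "SELECT nme FROM users ORDER BY id"
    return "SELECT name FROM users ORDER BY id"


# Toy in-memory database so the sketch runs without external services.
schema = "CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)"
conn = sqlite3.connect(":memory:")
conn.execute(schema)
conn.executemany("INSERT INTO users (name) VALUES (?)", [("Ada",), ("Linus",)])

sql, rows = generate_with_self_correction(
    conn, "List all user names", schema, fake_llm
)
print(sql)
print(rows)
```

In a real setup the generator would call an LLM with the schema and the previous error in the prompt, and the connection would point at the target warehouse; the retry loop itself stays the same.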
Quick Start & Requirements
git clone https://github.com/arunpshankar/LLM-Text-to-SQL-Architectures.git
cd LLM-Text-to-SQL-Architectures
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
Dependencies are listed in requirements.txt. Pattern III requires an ODBC connection for BigQuery. Pattern IV mentions the Code-Chat Bison model.
The repository also includes CONTRIBUTING.md and LICENSE.md.
Maintenance & Community
Contribution guidelines are available in CONTRIBUTING.md. The README does not list community channels (e.g., Discord, Slack) or name maintainers or sponsors.
Licensing & Compatibility
The project is licensed under the MIT License, which permits broad use, including commercial applications, provided the copyright and license notice are retained.
Limitations & Caveats
The README's "Challenges and Limitations" section is marked "In Progress," so known issues and potential pitfalls are not yet documented there and may be covered primarily in the linked external Medium article. No specific limitations are listed in the README itself.
2 years ago
Inactive