Advancing LLM-based Text-to-SQL generation
This repository is a continuously updated catalog of resources for Large Language Model (LLM)-based Text-to-SQL. It targets researchers and practitioners working on database interfaces and natural language processing, and collects surveys, papers, benchmarks, datasets, and open-source projects in one place. The project is anchored by a survey paper accepted by IEEE TKDE in 2025.
How It Works
In the basic workflow, an LLM receives a natural language question together with the relevant database schema and generates an executable SQL query. That query is then run against the target database to retrieve the results that answer the user's original question. The goal is a more intuitive and powerful interface for interacting with databases.
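The workflow described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not code from the repository or from any project it links to: `build_prompt`, `fake_llm`, and `answer` are hypothetical names, the `employees` table is invented sample data, and `fake_llm` returns a canned query in place of a real model call.

```python
import sqlite3


def build_prompt(schema: str, question: str) -> str:
    """Combine the database schema and the user's question into one prompt."""
    return (
        "Given the database schema below, write one SQLite query "
        "that answers the question.\n\n"
        f"Schema:\n{schema}\n\n"
        f"Question: {question}\nSQL:"
    )


def fake_llm(prompt: str) -> str:
    """Hypothetical stand-in for an LLM call; always returns a canned query."""
    return "SELECT name FROM employees WHERE salary > 50000 ORDER BY name;"


def answer(conn: sqlite3.Connection, question: str, generate_sql=fake_llm):
    """Read the schema, ask the model for SQL, then execute that SQL."""
    tables = conn.execute(
        "SELECT sql FROM sqlite_master WHERE type = 'table' AND sql IS NOT NULL"
    ).fetchall()
    schema = "\n".join(row[0] for row in tables)
    sql = generate_sql(build_prompt(schema, question))
    return sql, conn.execute(sql).fetchall()


# Invented sample database, used only for this illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (id INTEGER PRIMARY KEY, name TEXT, salary REAL)")
conn.executemany(
    "INSERT INTO employees (name, salary) VALUES (?, ?)",
    [("Ada", 72000), ("Bob", 41000), ("Cy", 55000)],
)

sql, rows = answer(conn, "Which employees earn more than 50000?")
print(sql)
print(rows)
```

To try this with a real model, pass a different callable as `generate_sql`. Generated SQL should be treated as untrusted input, for example by running it over a read-only connection.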
Quick Start & Requirements
This repository is a curated index and does not provide an installation or execution command of its own. For setup instructions, dependencies (e.g., Python versions, GPU requirements, CUDA), and execution details, users should consult the linked papers and projects. Key resources include links to numerous survey papers, prominent benchmarks such as BIRD and Spider (versions 1.0 and 2.0), and a wide array of original and post-annotated datasets.
Maintenance & Community
The repository is actively maintained and updated, with its foundation in a survey paper accepted by IEEE TKDE in 2025. Contributions are welcome via GitHub issues and pull requests. The maintainer can be contacted by email at zijin[dot]hong[at]connect[dot]polyu[dot]hk.
Licensing & Compatibility
The repository itself does not specify a license. Users are strongly advised to refer to the individual linked papers and projects for their respective licensing terms and compatibility restrictions, particularly concerning commercial use or integration into closed-source systems.
Limitations & Caveats
As a curated list, this repository offers no direct functionality; it is an index of pointers to external research resources. Because the repository itself carries no explicit license, the usage rights of each linked external project need to be reviewed individually. The content is primarily research-oriented and focuses on advancements reported up to late 2025.