Curated list of Text2SQL resources for LLMs, DSLs, APIs, and visualization
This repository is a curated collection of resources, tutorials, and benchmarks for Text-to-SQL and related Natural Language Interface to Database (NLIDB) tasks. It targets researchers and practitioners in the NLP and database communities, providing a comprehensive overview of the field's advancements, models, datasets, and evaluation metrics.
How It Works
The project serves as a central hub for the Text-to-SQL ecosystem, categorizing and linking to key research papers, foundational LLMs (like Llama, Mistral, Qwen), fine-tuning techniques (LoRA, QLoRA, RLHF), and benchmark datasets (WikiSQL, Spider, BIRD-SQL). It highlights the evolution of Text-to-SQL from traditional methods to LLM-driven approaches, emphasizing performance metrics like Exact Match (EM) and Execution Accuracy (EX).
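To make the two metrics concrete, here is a minimal sketch of how they might be computed for a single predicted query. It is an illustration only, not the official evaluator of any benchmark: the string-normalized `exact_match` is a simplification (Spider's exact match compares parsed SQL components rather than raw strings), and the `singer` table, the helper names, and the sample queries are all hypothetical.

```python
import sqlite3


def normalize_sql(sql: str) -> str:
    """Lowercase, collapse whitespace, and drop a trailing semicolon.

    Simplification: this also lowercases string literals, which a real
    evaluator would need to preserve.
    """
    return " ".join(sql.strip().rstrip(";").lower().split())


def exact_match(pred: str, gold: str) -> bool:
    """Simplified EM: the normalized query strings must be identical."""
    return normalize_sql(pred) == normalize_sql(gold)


def execution_match(conn: sqlite3.Connection, pred: str, gold: str) -> bool:
    """Simplified EX: both queries must return the same rows.

    Row order is ignored here (an assumption of this sketch); a
    prediction that fails to execute counts as a miss.
    """
    try:
        pred_rows = conn.execute(pred).fetchall()
    except sqlite3.Error:
        return False
    gold_rows = conn.execute(gold).fetchall()
    return sorted(pred_rows, key=repr) == sorted(gold_rows, key=repr)


def build_demo_db() -> sqlite3.Connection:
    """Create a small in-memory database with a hypothetical table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE singer (name TEXT, age INTEGER)")
    conn.executemany(
        "INSERT INTO singer VALUES (?, ?)",
        [("Ana", 25), ("Ben", 41), ("Cleo", 33)],
    )
    return conn


if __name__ == "__main__":
    conn = build_demo_db()
    gold = "SELECT name FROM singer WHERE age > 30"
    pred = "SELECT name FROM singer WHERE NOT age <= 30"
    print("EM:", exact_match(pred, gold))
    print("EX:", execution_match(conn, pred, gold))
```

The two sample queries are written differently but return the same rows, so they fail the string-based match while passing the execution-based one. This is the usual reason the two metrics are reported side by side.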
Quick Start & Requirements
This is a curated list of resources, not a runnable software package. To use it, follow the links to the papers, code repositories, and datasets, each of which may have its own requirements (e.g., Python, deep learning frameworks, specific hardware for running LLMs).
Highlighted Details
Maintenance & Community
The repository is maintained by the eosphoros-ai organization, which explicitly invites community contributions. It also links to related projects such as Awesome-AIGC-Tutorials and to the organization's work on privacy-preserving LLM solutions.
Licensing & Compatibility
The repository itself is a collection of links and information; the licensing of the linked resources (papers, code, datasets) varies and must be checked individually. Whether a given resource can be used for research or commercial purposes depends on the license of that resource, not on this list.
Limitations & Caveats
As a curated list, this repository does not provide a unified API or a single executable. Users must navigate and integrate the linked resources independently. Because LLMs evolve rapidly, the listed leaderboards and model performance figures can quickly become outdated.