This repository provides a framework for enhancing Text-to-SQL capabilities using Large Language Models (LLMs). It offers a comprehensive workflow for data processing, model fine-tuning (SFT), prediction, and evaluation, aiming to reduce training costs and improve accuracy for database querying via natural language. The target audience includes researchers and developers working on Text-to-SQL solutions.
How It Works
DB-GPT-Hub applies Supervised Fine-Tuning (SFT) to a range of LLMs, including CodeLlama, Llama2, and Qwen, using parameter-efficient techniques such as LoRA and QLoRA. It processes Text-to-SQL datasets such as Spider, WikiSQL, and BIRD-SQL, employing an information-matching generation approach that pairs table and schema information with the natural-language question so the model produces the corresponding SQL. The framework supports multiple fine-tuning and prediction methods, with an emphasis on improving accuracy while reducing computational cost.
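To make the schema-plus-question idea concrete, here is a minimal sketch of how one SFT sample could be assembled; the prompt template, field names, and example data are illustrative and are not the repository's exact format.

```python
# Illustrative only: DB-GPT-Hub's actual prompt template and JSON field names
# may differ; this sketches the general "table information + question" pairing.

def build_sft_example(schema_ddl: str, question: str, gold_sql: str) -> dict:
    """Combine table information with a natural-language question to form one
    supervised fine-tuning sample (input prompt + target SQL)."""
    prompt = (
        "Given the database schema, write a SQL query that answers the question.\n\n"
        f"Schema:\n{schema_ddl}\n\n"
        f"Question: {question}\n"
        "SQL:"
    )
    return {"input": prompt, "output": gold_sql}

example = build_sft_example(
    schema_ddl="CREATE TABLE singer (singer_id INT, name TEXT, country TEXT);",
    question="How many singers are from France?",
    gold_sql="SELECT COUNT(*) FROM singer WHERE country = 'France';",
)
print(example["input"])
```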
Quick Start & Requirements
- Install: `pip install dbgpt-hub` (a programmatic usage sketch follows this list)
- Prerequisites: Python 3.10 and Git. Fine-tuning requires substantial GPU memory (e.g., about 6 GB for 7B models and 13.4 GB for 13B models) and adequate disk space.
- Setup: Clone the repository, create a conda environment, and install dependencies. Data preprocessing involves running a shell script.
- Docs: Official Docs
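For pip-based installs, the package also exposes programmatic entry points that mirror the shell-script workflow. The sketch below assumes a module path such as `dbgpt_hub.train.start_sft` and an argument-dict interface; verify both against your installed version and the official docs, and treat the hyperparameter values as placeholders.

```python
# Assumed entry point and argument names -- check them against the installed
# dbgpt_hub version and the official docs; the values below are placeholders.
from dbgpt_hub.train import start_sft

train_args = {
    "model_name_or_path": "codellama/CodeLlama-13b-Instruct-hf",
    "dataset": "example_text2sql_train",
    "finetuning_type": "lora",            # or "qlora" for quantized fine-tuning
    "output_dir": "dbgpt_hub/output/adapter",
    "per_device_train_batch_size": 1,
    "gradient_accumulation_steps": 16,
    "learning_rate": 2e-4,
    "num_train_epochs": 8,
}

# Typical flow: preprocess data, fine-tune, predict on the dev set, evaluate.
# Analogous functions (e.g., start_predict, start_evaluate) are assumed to take
# similar configuration dicts; consult the docs for their exact arguments.
start_sft(train_args)
```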
Highlighted Details
- Supports fine-tuning for Text-to-SQL, Text-to-NLU, and Text-to-GQL.
- Achieved 0.764 execution accuracy on a 1.27 GB database with a fine-tuned 13B model in a zero-shot setting.
- Offers fine-tuning via LoRA and QLoRA, with configurable adapter parameters across supported LLM architectures (see the adapter-configuration sketch after this list).
- Includes scripts for data preprocessing, training, prediction, evaluation, and model weight merging.
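The LoRA/QLoRA parameters referenced above correspond to standard PEFT-style adapter settings. As a rough illustration only (the values and target modules are examples, not the repository's defaults), a comparable configuration expressed with the `peft` library looks like this:

```python
from peft import LoraConfig, TaskType

# Example values only; DB-GPT-Hub exposes equivalent knobs (rank, alpha,
# dropout, target modules) through its own training configuration.
lora_config = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    r=64,                    # adapter rank: higher = more capacity and memory
    lora_alpha=32,           # scaling factor applied to the adapter update
    lora_dropout=0.05,       # dropout on adapter layers for regularization
    target_modules=["q_proj", "v_proj"],  # attention projections to adapt (model-specific)
)
```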
Maintenance & Community
- Active community with Discord and WeChat channels for support and contributions.
- Regular updates and roadmap outlining future development, including inference optimization and Chinese language support.
- Welcomes contributions via issues and pull requests.
Licensing & Compatibility
- MIT License. Permissive for commercial use and integration with closed-source projects.
Limitations & Caveats
- The project is described as experimental.
- Performance benchmarks are provided for specific models and datasets; results may vary with different configurations or databases.
- Some advanced features like DeepSpeed multi-GPU training require specific configuration adjustments.