Research paper for multi-agent collaborative text-to-SQL framework
MAC-SQL is a multi-agent framework designed to tackle the Text-to-SQL problem, enabling more accurate and robust SQL query generation from natural language. It is targeted at researchers and developers working on natural language understanding and database interaction, offering a collaborative approach to improve performance on complex queries.
How It Works
The framework employs a three-agent collaborative system: a Selector, a Decomposer, and a Refiner. Each agent handles one stage of turning a natural language question and a database schema into SQL. The Selector prunes the schema down to the tables and columns relevant to the question, the Decomposer breaks a complex question into simpler sub-questions and generates SQL for them step by step, and the Refiner executes the resulting query and repairs it when it fails, yielding a final, executable SQL query. This modular design aims to improve accuracy and handle intricate query structures more effectively than single-prompt approaches.
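The control flow between the three agents can be sketched as follows. This is a minimal illustration and not the repository's implementation: the function names, the keyword-matching Selector, the prompt wording, and the `llm` callable (any function that maps a prompt string to a completion string) are all assumptions made for the example, and the Decomposer is collapsed into a single call for brevity.

```python
import sqlite3
from typing import Callable, Dict, List

Schema = Dict[str, List[str]]   # table name -> column names
LLM = Callable[[str], str]      # hypothetical stand-in for a model call


def selector(question: str, schema: Schema) -> Schema:
    """Keep only tables named in the question (naive keyword match)."""
    text = question.lower()
    kept = {t: cols for t, cols in schema.items() if t.lower() in text}
    return kept or schema  # fall back to the full schema if nothing matches


def decomposer(question: str, schema: Schema, llm: LLM) -> str:
    """Ask the model for SQL given the pruned schema (sub-question steps omitted)."""
    prompt = (
        f"Schema: {schema}\n"
        f"Question: {question}\n"
        "Break the question into sub-questions, then give the final SQL:"
    )
    return llm(prompt).strip()


def refiner(sql: str, conn: sqlite3.Connection, llm: LLM, max_rounds: int = 2) -> str:
    """Run the SQL; on a database error, ask the model to repair it and retry."""
    for _ in range(max_rounds):
        try:
            conn.execute(sql).fetchall()
            return sql  # query executes, accept it
        except sqlite3.Error as exc:
            sql = llm(f"Fix this SQL.\nSQL: {sql}\nError: {exc}\nSQL:").strip()
    return sql  # best effort after max_rounds attempts


def answer(question: str, schema: Schema, conn: sqlite3.Connection, llm: LLM) -> str:
    """Chain the three stages: prune schema, draft SQL, verify and repair."""
    pruned = selector(question, schema)
    draft = decomposer(question, pruned, llm)
    return refiner(draft, conn, llm)
```

Passing the model as a plain callable keeps the sketch runnable without network access: a stub function can stand in for the API during experiments, and a real client can be swapped in later.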
Quick Start & Requirements
Install:
conda create -n macsql python=3.9 -y
conda activate macsql
pip install -r requirements.txt
python -c "import nltk; nltk.download('punkt')"
Prerequisites: NLTK punkt tokenizer and OpenAI API access (GPT-4-1106-preview by default). Requires setting the OPENAI_API_BASE and OPENAI_API_KEY environment variables.
Data: download data.zip (BIRD and Spider datasets) from the provided Baidu Disk or Google Drive links and replace the existing data folder.
Demo: run scripts/app_bird.py or scripts/app_spider.py for SQL execution demos.
Highlighted Details
A bad_cases folder with examples of challenging queries.
Maintenance & Community
The project is associated with authors from various institutions and has been accepted to COLING 2025. No specific community channels like Discord or Slack are mentioned in the README.
Licensing & Compatibility
The repository does not explicitly state a license. The code is provided for research purposes, and commercial use or closed-source linking compatibility is not specified.
Limitations & Caveats
The framework relies heavily on OpenAI's API and on an older release of the openai Python package (e.g., openai==0.28.1), and requires careful configuration of API keys and endpoints. The default model is GPT-4-1106-preview, and running with local models requires specific deployment steps.
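Because a missing or blank key or endpoint tends to surface only as a failed API call, it can help to validate the configuration up front. The helper below is a hypothetical sketch, not part of the repository: `load_openai_settings` and `REQUIRED_VARS` are names invented for this example, and only the two environment variable names come from the README.

```python
import os
from typing import Dict, Mapping

# The two variables the README asks for.
REQUIRED_VARS = ("OPENAI_API_KEY", "OPENAI_API_BASE")


def load_openai_settings(env: Mapping[str, str] = os.environ) -> Dict[str, str]:
    """Return both settings, raising a clear error if any is missing or blank."""
    missing = [name for name in REQUIRED_VARS if not env.get(name, "").strip()]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {name: env[name].strip() for name in REQUIRED_VARS}


# With the legacy openai 0.28.x package, the values would typically be assigned
# to the module-level attributes openai.api_key and openai.api_base. That step
# is left out here so the sketch runs without the dependency installed.
```

Accepting the environment as a parameter makes the check easy to exercise with a plain dictionary before touching real credentials.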