Walkthrough for fine-tuning LLaMa 2 7B on Text-to-SQL datasets
This repository provides a walkthrough for fine-tuning LLaMa 2 7B on a Text-to-SQL dataset and performing inference against databases using LlamaIndex. It is targeted at developers and researchers looking to build custom Text-to-SQL applications.
How It Works
The project leverages LlamaIndex for database interaction and Hugging Face's datasets and peft libraries for efficient fine-tuning of LLaMa 2. The fine-tuning process is designed to be modular and runnable via Modal, a cloud-native development framework, which simplifies distributed training and deployment.
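To make the fine-tuning input concrete, here is a minimal sketch of how one Text-to-SQL record might be turned into a prompt/completion pair before training. The template wording, the TrainingExample class, and the format_example helper are all hypothetical and chosen for illustration; the repository's actual prompt format may differ.

```python
from dataclasses import dataclass

# Hypothetical prompt template (an assumption, not taken from the repository).
PROMPT_TEMPLATE = (
    "Given the table schema and a question, write the SQL query that answers it.\n\n"
    "### Schema:\n{schema}\n\n"
    "### Question:\n{question}\n\n"
    "### SQL:\n"
)


@dataclass
class TrainingExample:
    """One supervised example: the model sees `prompt` and learns to emit `completion`."""
    prompt: str
    completion: str


def format_example(schema: str, question: str, sql: str) -> TrainingExample:
    """Fill the template with one dataset record (hypothetical helper)."""
    prompt = PROMPT_TEMPLATE.format(schema=schema.strip(), question=question.strip())
    return TrainingExample(prompt=prompt, completion=sql.strip())


example = format_example(
    schema="CREATE TABLE cities (name TEXT, population INTEGER)",
    question="Which city has the highest population?",
    sql="SELECT name FROM cities ORDER BY population DESC LIMIT 1",
)
print(example.prompt + example.completion)
```

Keeping the SQL out of the prompt and in a separate completion field makes it straightforward to compute the training loss only on the query tokens, if the training loop supports that.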
Quick Start & Requirements
Clone the repository and change into it:
git clone https://github.com/run-llama/modal_finetune_sql.git
cd modal_finetune_sql
Then use the tutorial.ipynb notebook, or run the individual steps:
modal run src.load_data_sql
modal run --detach src.finetune_sql
modal run src.inference_sql_llamaindex::main --query "Which city has the highest population?" --sqlite-file-path "nbs/cities.db"
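The inference step above ultimately executes model-generated SQL against a SQLite file. The sketch below imitates only that last stage using Python's standard sqlite3 module and an in-memory database; it does not call the fine-tuned model or LlamaIndex. The table layout, the toy rows, and the generated_sql string are assumptions standing in for the contents of nbs/cities.db and for real model output.

```python
import sqlite3


def run_sql(conn: sqlite3.Connection, sql: str) -> list:
    """Execute a query and return all rows (hypothetical helper)."""
    return conn.execute(sql).fetchall()


# In-memory stand-in for a SQLite file; schema and rows are toy placeholders.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cities (name TEXT, population INTEGER)")
conn.executemany(
    "INSERT INTO cities VALUES (?, ?)",
    [("CityA", 100), ("CityB", 300), ("CityC", 200)],
)
conn.commit()

# Stand-in for what a Text-to-SQL model might produce for the question
# "Which city has the highest population?" (assumed, not real model output).
generated_sql = "SELECT name, population FROM cities ORDER BY population DESC LIMIT 1"

rows = run_sql(conn, generated_sql)
print(rows)  # [('CityB', 300)]
```

In practice, generated SQL should be treated as untrusted input, for example by opening the database read-only before executing it.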
Maintenance & Community
The project is part of the run-llama organization, suggesting active development in the Llama ecosystem. Further community engagement details are not explicitly provided in the README.
Licensing & Compatibility
The repository's licensing is not specified in the README. Compatibility for commercial use or closed-source linking would require clarification.
Limitations & Caveats
The README indicates that the code is adapted from another repository and provides a walkthrough, suggesting it may be experimental or a proof-of-concept. Specific performance benchmarks or production-readiness claims are not present.