Instruction tuning code for Flan-T5 using synthetic training data
This repository provides code and pretrained models for instruction tuning existing language models, specifically Flan-T5, using datasets like Alpaca and GPT4-Alpaca. It aims to make instruction-following capabilities more accessible and cost-effective, targeting researchers and developers working with large language models.
How It Works
The project leverages synthetic instruction data, generated by larger models like GPT-3, to fine-tune smaller, more accessible models such as Flan-T5. This approach allows for the transfer of instruction-following capabilities without the licensing constraints or computational demands of models like LLaMA. The code supports various data sources (Alpaca, GPT4-Alpaca, GPT4All, ShareGPT) and offers training scripts for different model sizes, including XL (3B) and XXL (11B) variants.
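As a rough illustration of this setup, the sketch below fine-tunes a Flan-T5 checkpoint on Alpaca-format data with the Hugging Face Seq2SeqTrainer. Note that the repository's own scripts are built on pytorch-lightning; the checkpoint name, prompt template, and hyperparameters here are illustrative assumptions rather than values taken from the repo.

```python
# Minimal sketch (not the repository's training script): fine-tune Flan-T5 on
# Alpaca-format instruction data using the Hugging Face Seq2SeqTrainer.
# The checkpoint name, prompt template, and hyperparameters are assumptions.
import json
from torch.utils.data import Dataset
from transformers import (AutoModelForSeq2SeqLM, AutoTokenizer,
                          DataCollatorForSeq2Seq, Seq2SeqTrainer,
                          Seq2SeqTrainingArguments)

class AlpacaDataset(Dataset):
    """Wraps a list of {"instruction", "input", "output"} records."""

    def __init__(self, path, tokenizer, max_len=512):
        with open(path) as f:
            self.records = json.load(f)
        self.tokenizer = tokenizer
        self.max_len = max_len

    def __len__(self):
        return len(self.records)

    def __getitem__(self, idx):
        r = self.records[idx]
        # Standard Alpaca prompt: the instruction, plus the optional input field.
        prompt = r["instruction"]
        if r.get("input"):
            prompt += "\n" + r["input"]
        enc = self.tokenizer(prompt, truncation=True, max_length=self.max_len)
        enc["labels"] = self.tokenizer(
            text_target=r["output"], truncation=True, max_length=self.max_len
        )["input_ids"]
        return enc

tokenizer = AutoTokenizer.from_pretrained("google/flan-t5-xl")
model = AutoModelForSeq2SeqLM.from_pretrained("google/flan-t5-xl")

trainer = Seq2SeqTrainer(
    model=model,
    args=Seq2SeqTrainingArguments(
        output_dir="flan-t5-xl-alpaca",
        per_device_train_batch_size=2,
        gradient_accumulation_steps=16,
        num_train_epochs=3,
        learning_rate=1e-4,
        bf16=True,            # A6000 GPUs support bfloat16
        logging_steps=10,
    ),
    train_dataset=AlpacaDataset("alpaca_data.json", tokenizer),
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
)
trainer.train()
```

For the XXL (11B) variant, the same recipe would additionally need parameter sharding (e.g., FSDP) across the multi-GPU setup described in the requirements below.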
Quick Start & Requirements
Create the environment with `conda create -n paca python=3.8 -y`, activate it with `conda activate paca`, and install dependencies with `pip install -r requirements.txt`. The training data files `alpaca_data.json`, `alpaca_data_cleaned.json`, and `alpaca_gpt4_data.json` are available from the repository releases. Key Python dependencies are `transformers`, `torch`, and `pytorch-lightning`. Training requires at least one A6000 GPU (4x A6000 for XXL models with FSDP).
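The data files are assumed to follow the standard Alpaca schema (an instruction, an optional input, and an output per record); a quick sanity check of a downloaded file might look like this:

```python
import json

# Inspect one of the downloaded data files; the instruction/input/output
# field names assume the standard Alpaca schema.
with open("alpaca_data.json") as f:
    records = json.load(f)

print(f"{len(records)} instruction examples")
example = records[0]
print("instruction:", example["instruction"])
print("input:", example.get("input", ""))
print("output:", example["output"])
```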
Highlighted Details

Maintenance & Community

The repository was last updated about 2 years ago and is listed as inactive.
Licensing & Compatibility
Limitations & Caveats
The project relies on synthetic data, which may contain noise. The README does not detail specific performance benchmarks against other instruction-tuned models beyond claims about Flacuna. Licensing for the code and models requires clarification for commercial applications.