An LLMOps pipeline for fine-tuning small LLMs to prepare for service LLM outages
LlamaDuo provides an LLMOps pipeline for fine-tuning small-scale local LLMs to replicate the performance of larger service LLMs, addressing potential service outages, data privacy concerns, and offline requirements. It targets developers and organizations seeking to migrate from cloud-based LLM services to self-hosted, cost-effective alternatives.
How It Works
The pipeline leverages Hugging Face's ecosystem for model management and fine-tuning. It uses powerful service LLMs (GPT-4o, Claude 3 Sonnet, Gemini 1.5 Flash) for synthetic data generation and evaluation. The core approach involves fine-tuning smaller LLMs (Gemma, Mistral, LLaMA) on prompt-response pairs, potentially augmented with synthetically generated data, to match the behavior of the service LLMs. This method allows for controlled migration and maintains desired output quality.
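The prompt-response pairing described above can be sketched as follows. This is a minimal, illustrative example of packaging service-LLM prompt/response pairs (plus synthetic ones) into chat-style training records for supervised fine-tuning; the function and field names are assumptions for illustration, not LlamaDuo's actual API.

```python
# Hypothetical sketch: turn prompt-response pairs from a service LLM
# into chat-format records usable for supervised fine-tuning of a
# smaller local model. Names here are illustrative assumptions.

def to_training_record(prompt: str, response: str) -> dict:
    """Wrap one prompt/response pair as a chat-style training record."""
    return {
        "messages": [
            {"role": "user", "content": prompt},
            {"role": "assistant", "content": response},
        ]
    }

# Pairs collected from the service LLM; synthetically generated pairs
# would be appended to this list in the same format.
pairs = [
    ("Summarize the report.", "The report covers ..."),
    ("Classify this ticket.", "Category: billing"),
]
dataset = [to_training_record(p, r) for p, r in pairs]
print(len(dataset))  # 2
```

A dataset in this shape can then be handed to a standard fine-tuning trainer; because records mirror the service LLM's observed behavior, the small model is trained to reproduce it.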
Quick Start & Requirements
Install the provider SDKs, e.g. pip install genai-apis[gemini] (and other providers as needed). Authenticate with Hugging Face via huggingface-cli login.
Highlighted Details
Builds on Hugging Face's alignment-handbook for streamlined fine-tuning.
Maintenance & Community
This project was developed during Google's ML Developer Programs sprints. Community interaction and support are available via GitHub issues.
Licensing & Compatibility
The project's licensing is not explicitly stated in the README. Compatibility for commercial use or closed-source linking would require clarification on the license.
Limitations & Caveats
The project is presented as a template rather than a library, requiring users to adapt it to their specific use cases. Customization of evaluation metrics and fine-tuning configurations is expected. The README does not specify the license, which could impact commercial adoption.