ms-swift by modelscope

SDK for fine-tuning and deploying LLMs/MLLMs

Created 2 years ago
9,935 stars

Top 5.1% on SourcePulse

Project Summary

SWIFT (Scalable lightWeight Infrastructure for Fine-Tuning) is a comprehensive framework for fine-tuning and deploying 450+ large language models (LLMs) and 150+ multimodal large language models (MLLMs). It targets researchers and developers working with these models, offering a unified pipeline from pre-training and fine-tuning through inference, evaluation, and deployment, streamlining the MLOps lifecycle for these complex models.

How It Works

SWIFT supports a wide array of training techniques, including parameter-efficient fine-tuning (PEFT) methods like LoRA, QLoRA, DoRA, and GaLore, alongside full-parameter fine-tuning. It also integrates advanced human alignment methods such as DPO, GRPO, KTO, and ORPO. For inference and deployment, SWIFT leverages acceleration engines like vLLM and LMDeploy, and supports quantization via GPTQ and AWQ. The framework is designed for flexibility, allowing custom model and dataset integration, as well as component customization (loss, metrics, optimizers).
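A minimal LoRA fine-tuning run can be sketched via the `swift sft` CLI entry point; the flag names follow recent ms-swift releases, and the model ID, dataset name, and hyperparameter values here are illustrative, not prescriptive:

```shell
# Hypothetical single-GPU LoRA fine-tune; exact flag names may vary across ms-swift versions
CUDA_VISIBLE_DEVICES=0 \
swift sft \
    --model Qwen/Qwen2.5-7B-Instruct \
    --train_type lora \
    --dataset AI-ModelScope/alpaca-gpt4-data-en \
    --lora_rank 8 \
    --lora_alpha 32 \
    --num_train_epochs 1 \
    --output_dir output
```

Swapping `--train_type lora` for `full` selects full-parameter fine-tuning; alignment methods such as DPO are typically driven through a separate subcommand (e.g. `swift rlhf`) in current releases.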

Quick Start & Requirements

  • Installation: pip install ms-swift -U or from source.
  • Python: >=3.9 (3.10 recommended).
  • Dependencies: PyTorch >=2.0, Transformers >=4.33, ModelScope >=1.19. Specific versions for DeepSpeed, vLLM, LMDeploy, and EvalScope are recommended for optimal performance. CUDA 12 is recommended for GPU acceleration.
  • Documentation: English Documentation
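The two installation routes above can be sketched as follows (the repository URL is assumed from the project's GitHub organization):

```shell
# Install the latest release from PyPI; -U upgrades an existing install
pip install ms-swift -U

# Or install from source for the newest features
git clone https://github.com/modelscope/ms-swift.git
cd ms-swift
pip install -e .
```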

Highlighted Details

  • Supports over 150 multi-modal models and datasets, covering vision, audio, and video modalities.
  • Integrates 10+ human alignment algorithms (DPO, GRPO, KTO, etc.) for both LLMs and MLLMs.
  • Offers a Gradio-based Web UI for zero-threshold training and deployment.
  • Provides acceleration for inference, evaluation, and deployment using vLLM and LMDeploy.
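A hedged sketch of launching the Web UI and serving a model with vLLM acceleration, assuming the `swift web-ui` and `swift deploy` subcommands from recent releases; the model ID is illustrative:

```shell
# Launch the Gradio-based Web UI for training and deployment
swift web-ui

# Serve a model as an OpenAI-compatible endpoint backed by vLLM
swift deploy --model Qwen/Qwen2.5-7B-Instruct --infer_backend vllm
```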

Maintenance & Community

  • Active development with frequent updates and new feature additions (e.g., GRPO, Megatron support).
  • Community support available via Discord.
  • Paper accepted at AAAI 2025.

Licensing & Compatibility

  • Licensed under the Apache License (Version 2.0). Model and dataset licenses are subject to their original sources.
  • Compatible with commercial use, provided original model/dataset licenses are respected.

Limitations & Caveats

  • While supporting many hardware types, optimal performance may require specific GPU configurations (e.g., A100/H100 for large-scale training).
  • The extensive feature set implies a complex dependency graph; users should carefully manage installations for specific use cases.
Health Check

  • Last commit: 14 hours ago
  • Responsiveness: 1 day
  • Pull requests (30d): 149
  • Issues (30d): 507
  • Star history: 610 stars in the last 30 days

Explore Similar Projects

Starred by Tobi Lutke (Cofounder of Shopify), Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems"), and 6 more.

xTuring by stochasticai
  • 0.0% · 3k stars
  • SDK for fine-tuning and customizing open-source LLMs
  • Created 2 years ago · Updated 1 day ago

Starred by Casper Hansen (Author of AutoAWQ), Yineng Zhang (Inference Lead at SGLang; Research Scientist at Together AI), and 5 more.

xtuner by InternLM
  • 0.5% · 5k stars
  • LLM fine-tuning toolkit for research
  • Created 2 years ago · Updated 1 day ago

Starred by Tony Lee (Author of HELM; Research Engineer at Meta), Lysandre Debut (Chief Open-Source Officer at Hugging Face), and 24 more.

LLaMA-Factory by hiyouga
  • 1.1% · 58k stars
  • Unified fine-tuning tool for 100+ LLMs & VLMs (ACL 2024)
  • Created 2 years ago · Updated 2 days ago