LLM-Finetune by Zeyi-Lin

Finetuning scripts for LLMs

created 1 year ago
458 stars

Top 67.0% on sourcepulse

Project Summary

This repository provides scripts and instructions for fine-tuning large language models (LLMs), focusing on Qwen2-VL, Qwen2, and GLM4. The covered tasks are text classification, named entity recognition (NER), and multimodal fine-tuning. It targets researchers and developers who work with these models and need a streamlined process for adapting them to custom datasets.

How It Works

The project takes a straightforward fine-tuning approach, likely standard supervised fine-tuning (SFT). It provides a separate Python script for each model and task combination, which hides much of the underlying training-loop complexity. Dataset download instructions and per-task commands streamline setup and execution.
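As a rough illustration of what standard SFT data preparation usually involves for a classification task, the sketch below builds a prompt/response pair and masks the prompt positions so that only the response contributes to the loss. This is an assumption about the general technique, not code taken from the repository: the whitespace tokenizer `toy_tokenize`, the prompt template, and the dictionary keys are hypothetical stand-ins for a real tokenizer and whatever format the scripts actually use. The value -100 is the index that PyTorch's cross-entropy loss ignores by default.

```python
IGNORE_INDEX = -100  # label value ignored by PyTorch's cross-entropy loss by default


def toy_tokenize(text, vocab):
    """Hypothetical whitespace tokenizer: maps each word to an integer id."""
    return [vocab.setdefault(word, len(vocab)) for word in text.split()]


def build_sft_example(text, categories, label, vocab):
    """Build input ids and loss labels for one classification example."""
    # Hypothetical prompt template, used only for illustration.
    prompt = f"Classify the text into one of: {', '.join(categories)} . Text: {text} Answer:"
    prompt_ids = toy_tokenize(prompt, vocab)
    response_ids = toy_tokenize(label, vocab)
    input_ids = prompt_ids + response_ids
    # Mask prompt positions so only the response tokens contribute to the loss.
    labels = [IGNORE_INDEX] * len(prompt_ids) + response_ids
    return {"input_ids": input_ids, "labels": labels}


vocab = {}
example = build_sft_example(
    text="The team won the final match",
    categories=["Sports", "Politics"],
    label="Sports",
    vocab=vocab,
)
```

With a real tokenizer the same idea applies: the model sees the whole sequence as input, but the gradient comes only from the answer tokens.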

Quick Start & Requirements

  • Install dependencies: pip install -r requirements.txt
  • Datasets for text classification (Fudan News) and named entity recognition (CCFBDCI) must be downloaded and placed in the root directory.
  • Qwen2-VL multimodal fine-tuning requires additional data preparation steps within the qwen2_vl directory.
  • Official documentation or demo links are not explicitly provided, but Jupyter Notebooks are mentioned for some tasks.
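Because the datasets have to be placed in the root directory by hand, a small pre-flight check can catch a missing download before a training run starts. The sketch below is one way to do that; the function `find_missing_datasets` and the folder names are hypothetical, since this summary does not state what the downloaded dataset directories are called.

```python
from pathlib import Path


def find_missing_datasets(root, required_dirs):
    """Return the names in required_dirs that are not directories under root."""
    root = Path(root)
    return [name for name in required_dirs if not (root / name).is_dir()]


if __name__ == "__main__":
    # Hypothetical folder names; substitute whatever the downloaded datasets are called.
    required = ["text_classification_data", "ner_data"]
    missing = find_missing_datasets(".", required)
    if missing:
        print("Missing dataset directories:", ", ".join(missing))
    else:
        print("All dataset directories found.")
```

Run it from the root directory after installing the dependencies and before starting any training script.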

Highlighted Details

  • Supports fine-tuning for Qwen2-VL (multimodal), Qwen2 (text classification, NER), and GLM4 (text classification, NER).
  • Includes specific training scripts for each model-task combination.
  • Provides inference scripts for Qwen2, Qwen2-VL, and GLM4 models.
  • Mentions integration with SwanLab for experiment tracking and visualization.
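For the NER tasks, instruction-style fine-tuning generally needs span annotations rewritten as a text target the model can generate. The sketch below shows one common way to do this, assuming character-offset annotations and a JSON list as the output. The function `ner_to_instruction`, the record keys, and the instruction wording are hypothetical; they are not the CCFBDCI schema or the format used by the repository's scripts.

```python
import json


def ner_to_instruction(text, spans):
    """Convert (start, end, label) character spans into a prompt/response record.

    `end` is exclusive, matching Python slicing.
    """
    entities = [
        {"entity_text": text[start:end], "entity_label": label}
        for start, end, label in sorted(spans)
    ]
    return {
        "instruction": "Extract all named entities from the text and return them as a JSON list.",
        "input": text,
        # ensure_ascii=False keeps non-ASCII (e.g. Chinese) text readable in the target.
        "output": json.dumps(entities, ensure_ascii=False),
    }


record = ner_to_instruction(
    "Alice joined Acme in Paris",
    [(13, 17, "ORG"), (0, 5, "PER"), (21, 26, "LOC")],
)
print(json.dumps(record, ensure_ascii=False, indent=2))
```

Sorting the spans keeps the target order deterministic, which makes generated outputs easier to compare against the reference during evaluation.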

Maintenance & Community

No information on contributors, sponsorships, community channels, or roadmap is available in the README.

Licensing & Compatibility

The license is not specified in the README. Compatibility for commercial use or closed-source linking is therefore undetermined.

Limitations & Caveats

The README does not specify the exact LLM architectures or versions supported beyond the model names, nor does it detail hardware requirements (e.g., GPU, VRAM). The project appears to be experimental, with no explicit mention of stability or production readiness.

Health Check

  • Last commit: 2 months ago
  • Responsiveness: 1 day
  • Pull requests (30d): 0
  • Issues (30d): 1

Star History

  • 91 stars in the last 90 days
