LLM-Finetune by Zeyi-Lin

Finetuning scripts for LLMs

Created 1 year ago

588 stars

Top 55.2% on SourcePulse

Project Summary

This repository provides scripts and instructions for fine-tuning large language models (LLMs), specifically Qwen2-VL, Qwen2, and GLM4, on tasks such as text classification, named entity recognition (NER), and multimodal fine-tuning. It targets researchers and developers who work with these models and need a streamlined process for adapting them to custom datasets.

How It Works

The project takes a straightforward approach, likely standard supervised fine-tuning (SFT). It provides a separate Python script for each model and task combination, which hides much of the training-loop complexity. Dataset download instructions and run commands for each task simplify setup and execution.
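
As a rough illustration of the supervised fine-tuning setup described above, the sketch below converts one labeled text-classification example into an instruction/response record and serializes it as JSON Lines. The field names (`instruction`, `input`, `output`), the prompt wording, and the helpers `to_sft_record` and `to_jsonl` are assumptions made for illustration; the repository's scripts may use a different schema.

```python
import json


def to_sft_record(text, categories, label):
    """Build one instruction-style training record (hypothetical schema)."""
    if label not in categories:
        raise ValueError(f"label {label!r} is not among the candidate categories")
    instruction = (
        "Classify the text into exactly one of the given categories "
        "and answer with the category name only."
    )
    return {
        "instruction": instruction,
        "input": f"Text: {text}\nCategories: {', '.join(categories)}",
        "output": label,  # the response the model is trained to produce
    }


def to_jsonl(records):
    """Serialize records as JSON Lines, keeping non-ASCII text readable."""
    return "\n".join(json.dumps(r, ensure_ascii=False) for r in records)


example = to_sft_record(
    text="The team won the championship final last night.",
    categories=["Sports", "Politics", "Economy"],
    label="Sports",
)
print(to_jsonl([example]))
```

Keeping the label as the entire response makes evaluation a simple string comparison, at the cost of requiring the prompt to list the candidate categories.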

Quick Start & Requirements

Install dependencies: pip install -r requirements.txt
Datasets for text classification (Fudan News) and named entity recognition (CCFBDCI) must be downloaded and placed in the repository's root directory.
Qwen2-VL multimodal fine-tuning requires additional data preparation steps within the qwen2_vl directory.
No official documentation or demo links are provided, though the README mentions Jupyter Notebooks for some tasks.
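
Because the datasets have to sit in the root directory before training, a small pre-flight check can catch a missing download early. This is a minimal sketch, not part of the repository: the helper `missing_paths` and the directory names in `REQUIRED` are hypothetical placeholders and should be replaced with whatever the download instructions actually produce.

```python
from pathlib import Path


def missing_paths(root, required):
    """Return the required relative paths that do not exist under root."""
    root = Path(root)
    return [name for name in required if not (root / name).exists()]


# Hypothetical names; only requirements.txt is mentioned in the summary.
REQUIRED = ["requirements.txt", "my_classification_data", "my_ner_data"]

if __name__ == "__main__":
    missing = missing_paths(".", REQUIRED)
    if missing:
        print("Missing before training:", ", ".join(missing))
    else:
        print("All required files and directories are present.")
```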

Highlighted Details

Supports fine-tuning for Qwen2-VL (multimodal), Qwen2 (text classification, NER), and GLM4 (text classification, NER).
Includes specific training scripts for each model-task combination.
Provides inference scripts for Qwen2, Qwen2-VL, and GLM4 models.
Mentions integration with SwanLab for experiment tracking and visualization.
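
For the NER fine-tuning listed above, one common way to frame the task for a generative model is to have it emit entities as structured text. The sketch below assumes a target format of one JSON object per line with the keys `entity_text` and `entity_label`; that format, the label names, and both helper functions are illustrative assumptions, not the repository's confirmed scheme.

```python
import json


def entities_to_target(entities):
    """Serialize gold entities as one JSON object per line (assumed target format)."""
    return "\n".join(
        json.dumps({"entity_text": text, "entity_label": label}, ensure_ascii=False)
        for text, label in entities
    )


def parse_prediction(generated):
    """Parse generated text back into (text, label) pairs, skipping malformed lines."""
    pairs = []
    for line in generated.splitlines():
        try:
            item = json.loads(line)
        except json.JSONDecodeError:
            continue  # the model may emit stray text; ignore it
        if isinstance(item, dict) and {"entity_text", "entity_label"} <= item.keys():
            pairs.append((item["entity_text"], item["entity_label"]))
    return pairs


gold = [("Beijing", "LOC"), ("Tsinghua University", "ORG")]
target = entities_to_target(gold)
print(target)
print(parse_prediction(target + "\nnot json"))
```

A tolerant parser matters at inference time, since a fine-tuned model can still produce lines that are not valid JSON.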

Maintenance & Community

No information on contributors, sponsorships, community channels, or roadmap is available in the README.

Licensing & Compatibility

The license is not specified in the README. Compatibility for commercial use or closed-source linking is therefore undetermined.

Limitations & Caveats

Beyond the model names, the README does not specify exact architectures or versions, nor does it detail hardware requirements (e.g., GPU type or VRAM). The project appears to be experimental: the README makes no statement about stability or production readiness.
