LLM-Pretrain-FineTune by X-jun-0130

LLM pretraining and fine-tuning for medical dialogue

created 2 years ago
273 stars

Top 95.3% on sourcepulse

Project Summary

This repository provides a framework for pre-training and fine-tuning Large Language Models (LLMs), specifically demonstrating a medical dialogue model. It targets researchers and developers working with LLMs in specialized domains, offering a structured approach to leverage DeepSpeed for efficient training on limited hardware.

How It Works

The project utilizes DeepSpeed with Zero-3, CPU offload, and FP16 for memory optimization, enabling the training of large models on multi-GPU setups (e.g., 8x A6000). It outlines data processing strategies for both pre-training and fine-tuning, including text concatenation, slicing, and special token usage, referencing established practices from models like ChatHome and Llama 2. The fine-tuning process includes masking input labels for supervised learning and mentions experimental support for LoRA.
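A minimal configuration sketch along these lines (ZeRO stage 3 with CPU offload and FP16) is shown below; the batch sizes and file name are illustrative assumptions, and the repository's actual DeepSpeed config may differ.

```python
# Hypothetical ZeRO-3 + CPU-offload + FP16 DeepSpeed config, written out as ds_config.json.
# All values here are illustrative; the repository's own settings may differ.
import json

ds_config = {
    "train_micro_batch_size_per_gpu": 4,   # assumed; tune to GPU memory
    "gradient_accumulation_steps": 8,      # assumed
    "fp16": {"enabled": True},
    "zero_optimization": {
        "stage": 3,
        "offload_param": {"device": "cpu", "pin_memory": True},
        "offload_optimizer": {"device": "cpu", "pin_memory": True},
        "overlap_comm": True,
        "contiguous_gradients": True,
    },
}

with open("ds_config.json", "w") as f:
    json.dump(ds_config, f, indent=2)
```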

Quick Start & Requirements

  • Install: Requires PyTorch 1.13.1, DeepSpeed 0.7.5 (or 0.8.3), and Transformers 4.21.0 (or 4.28.1).
  • Hardware: Recommended 8x 48GB A6000 GPUs.
  • Configuration: Supports sequence lengths of 1024 and 2048, with batch sizes given for each in the README.
  • Scripts: Includes scripts for pre-training (Model_Bloom_Pretrain.py), fine-tuning (Model_Bloom_Sft.py), model conversion, inference, and API serving; a hedged loading sketch follows this list.
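The skeleton below is a minimal sketch, not the repository's actual code: it shows how a Bloom checkpoint might be loaded with Transformers and handed to DeepSpeed for training. The model name and config path are assumptions.

```python
# Hypothetical skeleton in the spirit of Model_Bloom_Sft.py (the real script differs).
import deepspeed
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "bigscience/bloom-7b1"  # assumed base checkpoint; swap in your own
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.float16)

# deepspeed.initialize wires up ZeRO-3 / CPU offload / FP16 from ds_config.json
engine, optimizer, _, _ = deepspeed.initialize(
    model=model,
    model_parameters=model.parameters(),
    config="ds_config.json",
)
```

Launched with the standard `deepspeed` launcher across the available GPUs, this is the usual pattern for multi-GPU ZeRO-3 training; the base model could equally be swapped for Llama as the README suggests.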

Highlighted Details

  • Demonstrates pre-training and fine-tuning of a medical dialogue LLM.
  • Provides detailed data preprocessing steps for both pre-training and fine-tuning, including handling short texts, data formatting, and deduplication (a label-masking sketch follows this list).
  • Includes examples of generating structured medical reports and question-answering pairs from text.
  • Mentions the potential to replace the base model (e.g., Bloom) with others like Llama.
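As a hedged illustration of the label masking described under "How It Works" (supervising only the response tokens), the sketch below uses the common convention of setting prompt positions to -100 so they are ignored by the loss. The tokenizer name and prompt/response format are assumptions; the repository's own preprocessing may differ.

```python
# Minimal SFT label-masking sketch: -100 is ignored by PyTorch's cross-entropy loss,
# so only the response tokens contribute to the training signal.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bigscience/bloom-560m")  # assumed tokenizer

def build_example(prompt: str, response: str, max_len: int = 1024):
    prompt_ids = tokenizer(prompt, add_special_tokens=False).input_ids
    response_ids = tokenizer(response, add_special_tokens=False).input_ids
    response_ids = response_ids + [tokenizer.eos_token_id]

    input_ids = (prompt_ids + response_ids)[:max_len]
    labels = ([-100] * len(prompt_ids) + response_ids)[:max_len]
    return {"input_ids": input_ids, "labels": labels}
```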

Maintenance & Community

The most recent README update (2023-09-26) announces the open-sourcing of a WiNGPT model by Weining Health. Specific community links or active maintenance signals are not detailed in the README.

Licensing & Compatibility

The README does not explicitly state a license. Compatibility for commercial use or closed-source linking is not specified.

Limitations & Caveats

The README notes that LoRA fine-tuning currently only supports single-GPU setups and may have issues with multi-GPU configurations. It also highlights that many existing open-source medical LLMs may not meet the demands of practical applications.

Health Check

  • Last commit: 1 year ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 14 stars in the last 90 days
