AIDoctor by Jerry-XDL

Medical GPT model training with a ChatGPT-style pipeline

Created 5 months ago · 274 stars · Top 95.2% on sourcepulse

View on GitHub
Project Summary

AIDoctor provides a comprehensive pipeline for training a medical-focused GPT model built on the LLaMA architecture. It targets researchers and developers building specialized medical AI assistants, implementing a full ChatGPT-style training process: pre-training, supervised fine-tuning, and reinforcement learning from human feedback (RLHF) or direct preference optimization (DPO).

How It Works

AIDoctor follows a four-stage training methodology:

  1) Continued pretraining (PT): further pretrain LLaMA on medical-domain corpora to give it specialized knowledge.
  2) Supervised fine-tuning (SFT): fine-tune on medical Q&A datasets so the model follows instructions.
  3) Reward modeling (RM): train a model that predicts human preferences ("helpful, honest, harmless") from ranked medical dialogue data (a minimal sketch of the ranking objective follows below).
  4) Reinforcement learning (RL): optimize the SFT model against the reward model to maximize preferred outputs.

This staged approach aims to systematically improve the model's medical accuracy and alignment with user needs.
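
The reward-modeling stage typically optimizes a pairwise (Bradley-Terry-style) ranking loss over chosen/rejected answer pairs. Below is a minimal PyTorch sketch of that objective; the function name and toy data are illustrative, not taken from the repository.

    import torch
    import torch.nn.functional as F

    def pairwise_rm_loss(chosen_rewards: torch.Tensor,
                         rejected_rewards: torch.Tensor) -> torch.Tensor:
        # Pairwise ranking loss: push the scalar reward of the
        # human-preferred answer above that of the rejected answer.
        return -F.logsigmoid(chosen_rewards - rejected_rewards).mean()

    # Toy usage: scalar rewards for four (chosen, rejected) dialogue pairs.
    chosen = torch.tensor([1.2, 0.3, 0.8, 2.0])
    rejected = torch.tensor([0.5, 0.1, 1.0, 0.4])
    print(pairwise_rm_loss(chosen, rejected))  # smaller when chosen >> rejected

The RL stage then uses this reward signal (e.g., with PPO) to update the SFT policy; DPO skips the explicit reward model and optimizes on preference pairs directly.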

Quick Start & Requirements

  • Install/Run: python scripts/gradio_demo.py launches the interactive demo; sh scripts/run_pt.sh, sh run_sft.sh, sh run_rm.sh, and sh run_rl.sh run the four training stages in order (a hedged inference sketch follows this list).
  • Prerequisites: LLaMA model weights (HF format), medical datasets (e.g., shibing624/medical, FreedomIntelligence/HuatuoGPT-sft-data-v1), Python, PyTorch. GPU acceleration is highly recommended for training and inference.
  • Resources: Training requires significant computational resources (GPU memory, time) and large datasets.
  • Links: Hugging Face Demo, Datasets
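
Once an SFT adapter has been trained, inference with transformers and peft looks roughly like the following. This is a minimal sketch assuming HF-format LLaMA weights and a LoRA adapter directory; both paths are placeholders, not paths from the repository.

    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer
    from peft import PeftModel

    base_model = "path/to/llama-hf"    # placeholder: HF-format LLaMA weights
    lora_adapter = "path/to/sft-lora"  # placeholder: adapter from the SFT stage

    tokenizer = AutoTokenizer.from_pretrained(base_model)
    model = AutoModelForCausalLM.from_pretrained(
        base_model, torch_dtype=torch.float16, device_map="auto"
    )
    model = PeftModel.from_pretrained(model, lora_adapter)  # attach LoRA weights
    model.eval()

    prompt = "What are the common side effects of metformin?"
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    with torch.no_grad():
        output = model.generate(**inputs, max_new_tokens=256, do_sample=False)
    print(tokenizer.decode(output[0], skip_special_tokens=True))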

Highlighted Details

  • Implements a full ChatGPT training pipeline (PT, SFT, RM, RLHF/DPO).
  • Trains on a Chinese medical dataset of roughly 2.4 million records.
  • Supports LLaMA models and LoRA fine-tuning (see the configuration sketch after this list).
  • Includes a Gradio-based demo for interactive inference.
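
LoRA fine-tuning with the peft library is typically configured along these lines. This is a minimal sketch assuming LLaMA-style attention projection names; the rank and alpha values are illustrative defaults, not the repository's settings.

    from transformers import AutoModelForCausalLM
    from peft import LoraConfig, TaskType, get_peft_model

    model = AutoModelForCausalLM.from_pretrained("path/to/llama-hf")  # placeholder

    lora_config = LoraConfig(
        task_type=TaskType.CAUSAL_LM,
        r=8,                                  # low-rank dimension
        lora_alpha=16,                        # scaling factor
        lora_dropout=0.05,
        target_modules=["q_proj", "v_proj"],  # LLaMA attention projections
    )
    model = get_peft_model(model, lora_config)
    model.print_trainable_parameters()  # only the adapter weights are trainable

Freezing the base model and training only the low-rank adapters is what makes fine-tuning a LLaMA-scale model feasible on modest GPU hardware.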

Maintenance & Community

The project is actively developed by Jerry-XDL. Contributions are welcome via pull requests that include unit tests. The maintainer can be reached through GitHub Issues or email.

Licensing & Compatibility

The project code is licensed under Apache License 2.0, permitting commercial use. However, the model weights and data are restricted to research purposes only.

Limitations & Caveats

The SFT model may generate factually incorrect answers, may fail to recognize and refuse harmful instructions, and has limited ability in reasoning, coding, and multi-round dialogue. The model weights and data are strictly for research use; they must not be applied commercially or to any purpose that could cause societal harm.

Health Check

  • Last commit: 4 months ago
  • Responsiveness: Inactive
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 2 stars in the last 90 days
