Medical GPT model training with ChatGPT pipeline
AIDoctor provides a comprehensive pipeline for training a medical-focused GPT model, leveraging the LLaMA architecture. It targets researchers and developers aiming to build specialized medical AI assistants by implementing a full ChatGPT-like training process, including pre-training, supervised fine-tuning, and reinforcement learning from human feedback (RLHF) or direct preference optimization (DPO).
How It Works
AIDoctor follows a four-stage training methodology:

1) Continued pretraining: continue pretraining LLaMA on medical-domain data to imbue specialized knowledge.
2) Supervised fine-tuning (SFT): fine-tune on medical Q&A datasets to align the model with instruction following.
3) Reward modeling (RM): train a model that predicts human preferences ("helpful, honest, harmless") from ranked medical dialogue data (see the loss sketch after this list).
4) Reinforcement learning (RL): optimize the SFT model against the reward model to maximize preferred outputs.

This staged approach aims to systematically improve the model's medical accuracy and its alignment with user needs.
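To make the preference-learning stages concrete, here is a minimal sketch of the two objectives involved: the pairwise ranking loss commonly used for reward modeling (stage 3), and the DPO loss that the overview names as an alternative to RLHF. The function names and tensor shapes are illustrative only and are not taken from the AIDoctor codebase.

import torch
import torch.nn.functional as F

def reward_model_loss(chosen_rewards: torch.Tensor,
                      rejected_rewards: torch.Tensor) -> torch.Tensor:
    # Pairwise Bradley-Terry objective typical of RM training: push the
    # scalar reward of the human-preferred response above the rejected one.
    return -F.logsigmoid(chosen_rewards - rejected_rewards).mean()

def dpo_loss(policy_chosen_logps: torch.Tensor,
             policy_rejected_logps: torch.Tensor,
             ref_chosen_logps: torch.Tensor,
             ref_rejected_logps: torch.Tensor,
             beta: float = 0.1) -> torch.Tensor:
    # DPO folds the reward model into the policy: preferences are optimized
    # directly from log-probability ratios against a frozen reference model
    # (usually the SFT checkpoint), so no separate RL loop is needed.
    chosen_ratio = policy_chosen_logps - ref_chosen_logps
    rejected_ratio = policy_rejected_logps - ref_rejected_logps
    return -F.logsigmoid(beta * (chosen_ratio - rejected_ratio)).mean()

This is why the pipeline can offer DPO as a substitute for stages 3 and 4: a single supervised objective replaces reward modeling plus RL.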
Quick Start & Requirements
Run the Gradio demo:

python scripts/gradio_demo.py

Run the four training stages in order:

sh scripts/run_pt.sh
sh run_sft.sh
sh run_rm.sh
sh run_rl.sh
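Once the training stages have finished, the resulting checkpoint can be loaded like any Hugging Face causal LM. A minimal inference sketch follows; the checkpoint path is hypothetical, since the actual output directory depends on how the training scripts are configured.

from transformers import AutoModelForCausalLM, AutoTokenizer

model_path = "path/to/aidoctor-sft"  # hypothetical; substitute your output dir
tokenizer = AutoTokenizer.from_pretrained(model_path)
# device_map="auto" requires the accelerate package to be installed
model = AutoModelForCausalLM.from_pretrained(model_path, device_map="auto")

prompt = "What are common symptoms of type 2 diabetes?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))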
Maintenance & Community
The project is actively developed by Jerry-XDL. Contributions are welcome via pull requests that include unit tests. Contact is available through GitHub Issues or email.
Licensing & Compatibility
The project code is licensed under Apache License 2.0, permitting commercial use. However, the model weights and data are restricted to research purposes only.
Limitations & Caveats
The SFT model may generate factually incorrect answers, may fail to reliably refuse harmful instructions, and has limited capability in reasoning, code generation, and multi-turn dialogue. The model weights and data are strictly for research use and must not be used commercially or for any purpose that could cause societal harm.