PMC-LLaMA by chaoyi-wu

Medical LLM for instruction-following in the medical domain

created 2 years ago
658 stars

Top 51.8% on sourcepulse

Project Summary

This repository provides the official code and models for PMC-LLaMA, a family of open-source Large Language Models specifically designed for the medical domain. It aims to improve medical question answering and instruction following by pre-training on a large medical corpus and fine-tuning with instruction datasets, offering a specialized alternative to general-purpose LLMs for medical professionals and researchers.

How It Works

PMC-LLaMA follows a two-stage approach: first, pre-training a base LLaMA model on a vast collection of medical literature (PubMed Central papers and medical books), and second, fine-tuning this pre-trained model on an instruction-following dataset. This domain-specific pre-training is crucial for imbuing the model with medical knowledge, while instruction tuning enhances its ability to understand and respond to medical queries and tasks.
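As a rough illustration of the first stage, continued pre-training on medical text can be sketched with the Hugging Face Trainer as below. This is not the repository's own training script: the base checkpoint name, corpus file, and hyperparameters are placeholders.

    # Minimal sketch of domain-adaptive (continued) pre-training on medical text.
    # Not the repository's actual script: base checkpoint, corpus file, and
    # hyperparameters below are illustrative placeholders.
    from datasets import load_dataset
    from transformers import (AutoModelForCausalLM, AutoTokenizer,
                              DataCollatorForLanguageModeling, Trainer, TrainingArguments)

    base_model = "huggyllama/llama-13b"  # placeholder LLaMA-style base checkpoint
    tokenizer = AutoTokenizer.from_pretrained(base_model)
    if tokenizer.pad_token is None:
        tokenizer.pad_token = tokenizer.eos_token  # LLaMA tokenizers ship without a pad token
    model = AutoModelForCausalLM.from_pretrained(base_model)

    # Placeholder corpus; for PMC-LLaMA this would be PubMed Central papers and medical books.
    corpus = load_dataset("text", data_files={"train": "medical_corpus.txt"})["train"]
    tokenized = corpus.map(
        lambda batch: tokenizer(batch["text"], truncation=True, max_length=2048),
        batched=True, remove_columns=["text"])

    trainer = Trainer(
        model=model,
        args=TrainingArguments(output_dir="pmc_llama_pretrain",
                               per_device_train_batch_size=1,
                               gradient_accumulation_steps=16,
                               num_train_epochs=1, bf16=True),
        train_dataset=tokenized,
        data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
    )
    trainer.train()

The instruction-tuning stage follows the same pattern, swapping the raw corpus for the released instruction dataset.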

Quick Start & Requirements

  • Install:
    conda install pytorch==1.13.0 torchvision==0.14.0 torchaudio==0.13.0 pytorch-cuda=11.6 -c pytorch -c nvidia
    pip install transformers==4.28.1 sentencepiece datasets
    
  • Prerequisites: PyTorch with CUDA 11.6, transformers, sentencepiece, datasets.
  • Usage: Load models via the Hugging Face transformers library (e.g., axiong/PMC_LLaMA_13B); see simple_test.py in the repo and the loading sketch after this list for examples.
  • Links: Hugging Face Models, PMC LLaMA Instructions
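A minimal, hedged sketch of loading the released checkpoint and generating an answer is shown below. The prompt wording and generation settings here are illustrative; the official example in simple_test.py may differ.

    # Minimal loading/generation sketch (requires accelerate for device_map="auto").
    # Prompt wording is illustrative; see simple_test.py for the official format.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "axiong/PMC_LLaMA_13B"
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id, torch_dtype=torch.float16, device_map="auto")

    prompt = ("Instruction: Answer the medical question concisely.\n"
              "Input: What electrolyte abnormality is typical of primary hyperaldosteronism?\n"
              "Response:")
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output = model.generate(**inputs, max_new_tokens=128, do_sample=False)
    print(tokenizer.decode(output[0], skip_special_tokens=True))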

Highlighted Details

  • PMC_LLaMA_13B achieves 56.36 on USMLE, 56.04 on MedMCQA, and 77.9 on PubMedQA, outperforming LLaMA-2 and other medical LLMs on these benchmarks (see the evaluation sketch after this list).
  • The project also released MedLLaMA_13B, pre-trained on 4.8M PubMed Central papers and medical books.
  • The team has also introduced RaTEScore, a new metric for evaluating generative medical foundation models.
  • The project provides scripts for both pre-training and instruction tuning.
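One simple, hypothetical way to check multiple-choice accuracy on benchmarks like USMLE or MedMCQA is to compare the model's likelihood for each answer option, as sketched below; the repository's own evaluation prompts and answer extraction may differ.

    # Hypothetical likelihood-based scoring of one multiple-choice item; the
    # repo's evaluation pipeline may use different prompts and parsing.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "axiong/PMC_LLaMA_13B"
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id, torch_dtype=torch.float16, device_map="auto")

    question = "Deficiency of which vitamin causes scurvy?"
    options = {"A": "Vitamin A", "B": "Vitamin B12", "C": "Vitamin C", "D": "Vitamin D"}

    def avg_loglik(answer_text):
        """Average log-likelihood of the question plus a candidate answer."""
        ids = tokenizer(f"Question: {question}\nAnswer: {answer_text}",
                        return_tensors="pt").input_ids.to(model.device)
        with torch.no_grad():
            loss = model(ids, labels=ids).loss  # mean cross-entropy over tokens
        return -loss.item()

    prediction = max(options, key=lambda k: avg_loglik(options[k]))
    print("Predicted option:", prediction)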

Licensing & Compatibility

  • The models are derivatives of LLaMA and therefore subject to the LLaMA license. The repository's own code appears to be permissively licensed, but no explicit license is stated. The medical books used for pre-training are not redistributed due to licensing restrictions.

Limitations & Caveats

  • The raw content of the medical books used for pre-training is not released due to licensing restrictions, so reproducing pre-training requires acquiring and processing the books yourself.
  • Older models such as MedLLaMA_13B may emit spurious citation numbers, an artifact of training on scientific papers.
Health Check

  • Last commit: 1 year ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0

Star History

  • 16 stars in the last 90 days
