PMC-LLaMA by chaoyi-wu

Medical LLM for instruction-following in the medical domain

created 2 years ago
658 stars

Top 51.8% on sourcepulse

Project Summary

This repository provides the official code and models for PMC-LLaMA, a family of open-source Large Language Models specifically designed for the medical domain. It aims to improve medical question answering and instruction following by pre-training on a large medical corpus and fine-tuning with instruction datasets, offering a specialized alternative to general-purpose LLMs for medical professionals and researchers.

How It Works

PMC-LLaMA follows a two-stage approach: first, pre-training a base LLaMA model on a vast collection of medical literature (PubMed Central papers and medical books), and second, fine-tuning this pre-trained model on an instruction-following dataset. This domain-specific pre-training is crucial for imbuing the model with medical knowledge, while instruction tuning enhances its ability to understand and respond to medical queries and tasks.
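As a rough illustration of the first stage, continued pre-training on medical text can be sketched with the Hugging Face Trainer as below. This is not the repository's own training script: the base checkpoint name, corpus file, and hyperparameters are placeholders.

    # Minimal sketch of domain-adaptive (continued) pre-training on medical text.
    # Not the repository's actual script: base checkpoint, corpus file, and
    # hyperparameters below are illustrative placeholders.
    from datasets import load_dataset
    from transformers import (AutoModelForCausalLM, AutoTokenizer,
                              DataCollatorForLanguageModeling, Trainer, TrainingArguments)

    base_model = "huggyllama/llama-13b"  # placeholder LLaMA-style base checkpoint
    tokenizer = AutoTokenizer.from_pretrained(base_model)
    if tokenizer.pad_token is None:
        tokenizer.pad_token = tokenizer.eos_token  # LLaMA tokenizers ship without a pad token
    model = AutoModelForCausalLM.from_pretrained(base_model)

    # Placeholder corpus; for PMC-LLaMA this would be PubMed Central papers and medical books.
    corpus = load_dataset("text", data_files={"train": "medical_corpus.txt"})["train"]
    tokenized = corpus.map(
        lambda batch: tokenizer(batch["text"], truncation=True, max_length=2048),
        batched=True, remove_columns=["text"])

    trainer = Trainer(
        model=model,
        args=TrainingArguments(output_dir="pmc_llama_pretrain",
                               per_device_train_batch_size=1,
                               gradient_accumulation_steps=16,
                               num_train_epochs=1, bf16=True),
        train_dataset=tokenized,
        data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
    )
    trainer.train()

The instruction-tuning stage follows the same pattern, swapping the raw corpus for the released instruction dataset.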

Quick Start & Requirements

  • Install:
    conda install pytorch==1.13.0 torchvision==0.14.0 torchaudio==0.13.0 pytorch-cuda=11.6 -c pytorch -c nvidia
    pip install transformers==4.28.1 sentencepiece datasets
    
  • Prerequisites: PyTorch with CUDA 11.6, transformers, sentencepiece, datasets.
  • Usage: Load models via the Hugging Face transformers library (e.g., axiong/PMC_LLaMA_13B); see simple_test.py in the repo and the loading sketch after this list for examples.
  • Links: Hugging Face Models, PMC LLaMA Instructions
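A minimal, hedged sketch of loading the released checkpoint and generating an answer is shown below. The prompt wording and generation settings here are illustrative; the official example in simple_test.py may differ.

    # Minimal loading/generation sketch (requires accelerate for device_map="auto").
    # Prompt wording is illustrative; see simple_test.py for the official format.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "axiong/PMC_LLaMA_13B"
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id, torch_dtype=torch.float16, device_map="auto")

    prompt = ("Instruction: Answer the medical question concisely.\n"
              "Input: What electrolyte abnormality is typical of primary hyperaldosteronism?\n"
              "Response:")
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output = model.generate(**inputs, max_new_tokens=128, do_sample=False)
    print(tokenizer.decode(output[0], skip_special_tokens=True))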

Highlighted Details

  • PMC_LLaMA_13B achieves 56.36 on USMLE, 56.04 on MedMCQA, and 77.9 on PubMedQA, outperforming LLaMA-2 and other medical LLMs on these benchmarks (see the evaluation sketch after this list).
  • The project also released MedLLaMA_13B, pre-trained on 4.8M PubMed Central papers and medical books.
  • The team has also introduced RaTEScore, a new metric for evaluating generative medical foundation models.
  • The project provides scripts for both pre-training and instruction tuning.
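One simple, hypothetical way to check multiple-choice accuracy on benchmarks like USMLE or MedMCQA is to compare the model's likelihood for each answer option, as sketched below; the repository's own evaluation prompts and answer extraction may differ.

    # Hypothetical likelihood-based scoring of one multiple-choice item; the
    # repo's evaluation pipeline may use different prompts and parsing.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "axiong/PMC_LLaMA_13B"
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id, torch_dtype=torch.float16, device_map="auto")

    question = "Deficiency of which vitamin causes scurvy?"
    options = {"A": "Vitamin A", "B": "Vitamin B12", "C": "Vitamin C", "D": "Vitamin D"}

    def avg_loglik(answer_text):
        """Average log-likelihood of the question plus a candidate answer."""
        ids = tokenizer(f"Question: {question}\nAnswer: {answer_text}",
                        return_tensors="pt").input_ids.to(model.device)
        with torch.no_grad():
            loss = model(ids, labels=ids).loss  # mean cross-entropy over tokens
        return -loss.item()

    prediction = max(options, key=lambda k: avg_loglik(options[k]))
    print("Predicted option:", prediction)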

Licensing & Compatibility

  • The models are derivatives of LLaMA and therefore subject to the LLaMA license. The repository's own code appears to be permissively licensed, but no explicit license is stated. The medical books used for pre-training are not redistributed due to licensing restrictions.

Limitations & Caveats

  • The raw content of the medical books used for pre-training is not released due to licensing restrictions, so reproducing pre-training requires acquiring and processing the books yourself.
  • Older models such as MedLLaMA_13B may emit spurious citation numbers, an artifact of training on scientific papers.
Health Check

  • Last commit: 1 year ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0

Star History

  • 16 stars in the last 90 days
