MMedLM by MAGIC-AI4Med

Multilingual language model for medicine (research paper & models)

created 1 year ago
264 stars

Top 97.5% on sourcepulse

Project Summary

This repository provides the official code and models for "Towards Building Multilingual Language Model for Medicine," a project focused on creating open-source, multilingual LLMs for the medical domain. It offers a large multilingual medical corpus (MMedC), a medical question-answering benchmark (MMedBench), and several pre-trained and fine-tuned models, including MMed-Llama3.1-70B, which rivals GPT-4 performance across multiple languages.

How It Works

The project constructs a large multilingual medical corpus (MMedC) of 25.5 billion tokens spanning six languages and uses it for further auto-regressive pre-training of general-purpose LLMs. It also introduces MMedBench, a multilingual medical multiple-choice QA benchmark with rationales, for evaluating and tracking model progress. The resulting models, after further pre-training and fine-tuning on these resources, show significant gains over existing open-source medical LLMs and competitive results against proprietary models such as GPT-4.
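
The released checkpoints are standard causal language models, so they can be loaded directly with Hugging Face Transformers. The sketch below is illustrative rather than the repository's own inference script; the model path "Henrychur/MMed-Llama-3-8B" is an assumption and should be verified against the model links in the README.

```python
# Minimal inference sketch (assumption: "Henrychur/MMed-Llama-3-8B" is the
# Hugging Face path of the released 8B model; verify against the README).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Henrychur/MMed-Llama-3-8B"  # assumed model path

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # half precision so the 8B model fits on one large GPU
    device_map="auto",          # requires the accelerate package
)

prompt = "Question: What is the first-line pharmacological treatment for type 2 diabetes?\nAnswer:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```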

Quick Start & Requirements

  • Installation: Code is organized into folders for pre-training, fine-tuning, and inference. Key dependencies include PyTorch 1.13 and Transformers 4.37; LoRA fine-tuning additionally requires the PEFT library (see the sketch after this list).
  • Hardware: Auto-regressive training on MMedC requires at least 8 A100 80GB GPUs and extended training periods (over a month). Inference and fine-tuning can be adapted for single machines by removing Slurm commands.
  • Resources: The project offers models of various sizes (1.8B, 7B, 8B, 70B parameters).
  • Links: Paper (arXiv): https://arxiv.org/abs/2402.13963; Leaderboard: https://github.com/MAGIC-AI4Med/MMedLM/blob/main/leaderboard.md
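
For the LoRA route mentioned in the Installation item, a minimal PEFT setup looks like the following. This is a generic PEFT pattern under the assumption of a Llama-style base model; the model path, target modules, and hyperparameters are illustrative, and the repository's own fine-tuning scripts should be used to reproduce reported results.

```python
# Minimal LoRA fine-tuning setup with PEFT (a generic pattern, not the
# repository's exact training code). Model path and hyperparameters are illustrative.
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

base = AutoModelForCausalLM.from_pretrained("Henrychur/MMed-Llama-3-8B")  # assumed path

lora_config = LoraConfig(
    r=16,                                 # rank of the low-rank adapter matrices
    lora_alpha=32,                        # scaling factor applied to the adapter output
    target_modules=["q_proj", "v_proj"],  # attention projections typically adapted in Llama-style models
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)

model = get_peft_model(base, lora_config)
model.print_trainable_parameters()  # only the adapter weights are trainable

# From here, train with a standard Hugging Face Trainer (or a custom loop) on the
# MMedBench trainset, then save the adapter with model.save_pretrained(...).
```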

Highlighted Details

  • MMed-Llama3.1-70B achieves 80.51 on MMedBench, outperforming GPT-4 (74.27) and supporting 8 languages.
  • MMedLM 2 (7B) rivals GPT-4 on MMedBench.
  • MMed-Llama 3 (8B) outperforms Llama 3 on English benchmarks such as MedQA (65.4 vs. 60.9) and MMedBench (79.25 vs. 63.86).
  • The project releases the data collection pipeline, including filtering and OCR code.

Maintenance & Community

  • The accompanying paper was published in Nature Communications, and the project has active releases, including recent models such as MMed-Llama3.1-70B.
  • Contact: qiupengcheng@pjlab.org.cn.

Licensing & Compatibility

  • The repository is released under the Apache 2.0 license.
  • Commercial use is generally permitted under the Apache 2.0 license.

Limitations & Caveats

  • Full auto-regressive training on the MMedC corpus is computationally intensive, requiring significant GPU resources and time.
  • Benchmark comparisons are not strictly like-for-like: open-source models are fine-tuned on the MMedBench trainset before evaluation, whereas proprietary models such as GPT-3.5, GPT-4, and Gemini are evaluated zero-shot via API.
Health Check

  • Last commit: 2 months ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 0
  • Issues (30d): 0

Star History

  • 20 stars in the last 90 days

Explore Similar Projects

Starred by Ross Taylor (Cofounder of General Reasoning; Creator of Papers with Code), Daniel Han (Cofounder of Unsloth), and 4 more.

open-instruct by allenai

0.2% · 3k stars
Training codebase for instruction-following language models
created 2 years ago · updated 1 day ago
Starred by Chip Huyen (Author of AI Engineering, Designing Machine Learning Systems), Ying Sheng (Author of SGLang), and 9 more.

alpaca-lora by tloen

0.0% · 19k stars
LoRA fine-tuning for LLaMA
created 2 years ago · updated 1 year ago