UCSC-VLAA/MedReason: medical reasoning dataset and models for LLMs
MedReason addresses the challenge of enabling faithful and explainable medical reasoning in Large Language Models (LLMs). It provides a large-scale dataset of 32,682 medical QA pairs, augmented with step-by-step reasoning paths derived from a knowledge graph. This resource aims to improve LLM performance on complex medical problem-solving tasks, benefiting researchers and developers in the medical AI domain.
How It Works
The project leverages a structured medical knowledge graph to systematically convert clinical question-answer pairs into detailed, logical "thinking paths." This automated pipeline generates a high-quality dataset for supervised finetuning (SFT). By training LLMs on these explicit reasoning steps, MedReason enhances their ability to provide accurate and explainable medical insights, moving beyond simple answer generation.
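As a rough illustration of this pipeline, a knowledge-graph reasoning path can be folded into an SFT training record as sketched below. The function and field names are illustrative, not MedReason's actual schema:

```python
# Sketch: convert a QA pair plus knowledge-graph reasoning steps into a
# supervised finetuning (SFT) record. Field names are hypothetical and
# do not reflect MedReason's real data format.

def build_sft_record(question: str, answer: str, kg_steps: list[str]) -> dict:
    """Join knowledge-graph hops into an explicit step-by-step target."""
    reasoning = "\n".join(f"Step {i}: {s}" for i, s in enumerate(kg_steps, 1))
    return {
        "prompt": question,
        "completion": f"{reasoning}\nFinal answer: {answer}",
    }

record = build_sft_record(
    "Which enzyme deficiency causes phenylketonuria?",
    "Phenylalanine hydroxylase",
    [
        "Phenylketonuria is an inborn error of amino acid metabolism.",
        "It results from impaired conversion of phenylalanine to tyrosine.",
        "That conversion is catalyzed by phenylalanine hydroxylase.",
    ],
)
print(record["completion"])
```

Training on targets like the `completion` field above is what pushes the model to emit explicit reasoning steps rather than a bare answer.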
Quick Start & Requirements
- UCSC-VLAA/MedReason-8B can be loaded directly with Hugging Face transformers (example provided).
- Deployment is supported via vLLM or SGLang; training uses accelerate and deepspeed (ZeRO-3 configuration).
- Base models include HuatuoGPT-o1-8B and DeepSeek-R1-Distill-Llama-8B.
- Evaluation scripts rely on SGLang for model deployment.
- Dependencies: transformers, torch, accelerate, deepspeed, sglang.

Highlighted Details
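A minimal loading sketch with Hugging Face transformers, using the checkpoint name from the README. The prompt and generation settings are illustrative, not a MedReason-specific recipe; an 8B model needs a GPU with sufficient memory:

```python
# Sketch: load MedReason-8B with Hugging Face transformers and generate an
# answer using the generic chat-template pattern. The question below is an
# arbitrary example, not from the MedReason dataset.

def build_messages(question: str) -> list[dict]:
    """Wrap a medical question in a single-turn chat message list."""
    return [{"role": "user", "content": question}]

if __name__ == "__main__":
    # Heavy imports kept inside the guard; requires transformers, torch,
    # and accelerate (for device_map="auto").
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "UCSC-VLAA/MedReason-8B"
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id, torch_dtype="auto", device_map="auto"
    )
    inputs = tokenizer.apply_chat_template(
        build_messages("What is the first-line treatment for anaphylaxis?"),
        add_generation_prompt=True,
        return_tensors="pt",
    ).to(model.device)
    outputs = model.generate(inputs, max_new_tokens=512)
    # Decode only the newly generated tokens, skipping the prompt.
    print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```

For higher-throughput serving, the README points to vLLM or SGLang instead of raw transformers generation.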
The MedReason-8B model achieves state-of-the-art performance on medical reasoning benchmarks.

Maintenance & Community
The project acknowledges contributions from foundational models and tools like HuatuoGPT, trl, and sglang. No direct community channels (Discord/Slack) or explicit roadmap are provided in the README.
Licensing & Compatibility
The README does not specify a software license. This lack of clarity poses a significant adoption risk, particularly for commercial use or integration into proprietary systems.
Limitations & Caveats
The README does not detail known limitations, bugs, or alpha status. Data generation requires Azure API keys. Training and evaluation necessitate substantial hardware resources (e.g., 8-GPU clusters).