michael-wzhu
Chinese instruction-tuning dataset for multi-task, few-shot medical NLP
PromptCBLUE is a benchmark dataset and evaluation framework for large language models (LLMs) in the Chinese medical domain. It transforms 16 existing CBLUE tasks into prompt-based generation tasks, aiming to standardize LLM evaluation in medical NLP. The project targets researchers and developers working with LLMs in healthcare, providing a unified platform for assessing model performance on diverse medical NLP challenges.
How It Works
PromptCBLUE reformulates 16 medical NLP tasks from the CBLUE benchmark into a prompt-based generation format. Each sample is converted into a record with four fields (input, target, type, and answer_choices) that an LLM can process directly as text. Because every task shares this one format, a single generation-based pipeline can evaluate an LLM across all of the medical NLP tasks.
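As a rough illustration of the record structure described above, the sketch below wraps one classification example in the four fields. Only the field names (input, target, type, answer_choices) come from the project description; the `to_prompt_record` helper, the English prompt template, the `"cls"` type value, and the sample text are hypothetical, and the actual dataset is in Chinese and uses its own templates.

```python
import json


def to_prompt_record(text, label, choices, task_type):
    """Wrap one classification example as a prompt-style record.

    The prompt wording here is an illustrative assumption, not the
    template used by PromptCBLUE.
    """
    prompt = (
        "Classify the following medical text.\n"
        f"Text: {text}\n"
        f"Options: {', '.join(choices)}\n"
        "Answer:"
    )
    return {
        "input": prompt,            # full prompt shown to the model
        "target": label,            # expected generated text
        "type": task_type,          # task category (hypothetical value below)
        "answer_choices": choices,  # candidate labels, if the task has any
    }


record = to_prompt_record(
    text="Patient reports persistent cough and fever for three days.",
    label="symptom description",
    choices=["symptom description", "treatment plan", "other"],
    task_type="cls",
)

# ensure_ascii=False keeps non-ASCII (e.g. Chinese) text readable in the output.
print(json.dumps(record, ensure_ascii=False, indent=2))
```

Keeping the target as plain text means every task, whether classification, extraction, or generation, can be scored by comparing the model's generated string against it after post-processing.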
Quick Start & Requirements
datasets/toy_examples.test_predictions.json file and a post_generate_process.py script (Python standard library only) for evaluation.
Highlighted Details
Maintenance & Community
Licensing & Compatibility
Limitations & Caveats
1 year ago
Inactive