phi3-Chinese by CrazyBoyM

Chinese post-training models for Phi-3

Created 1 year ago
322 stars

Top 84.3% on SourcePulse

Project Summary

This repository curates and shares fine-tuned Chinese versions of Microsoft's Phi-3 models, targeting developers and researchers interested in efficient, mobile-deployable LLMs. It aims to highlight lesser-known Phi-3 variants and provide tutorials for training and deployment, offering a potential alternative to larger models like Llama3 8B.

How It Works

The project focuses on collecting and presenting various fine-tuned Phi-3 models, particularly those with Chinese language capabilities. It links to specific fine-tuned versions available on platforms like ModelScope, including incremental SFT and DPO variants, and plans to explore vocabulary expansion. The core idea is to leverage Phi-3's small size (3.8B parameters) and reported performance advantages for practical deployment scenarios.

Quick Start & Requirements

  • Web Deployment: streamlit run deploy/streamlit_for_instruct.py ./Phi-3-mini-128k-instruct-Chinese
  • Prerequisites: Access to specific fine-tuned models (links provided to ModelScope and Hugging Face).
  • Resources: No specific hardware requirements are detailed, but the project emphasizes mobile deployability.

Highlighted Details

  • Focuses on Phi-3-mini, a 3.8B-parameter model claimed to outperform Llama3 8B on benchmarks.
  • Provides links to Chinese fine-tuned versions (SFT, DPO) on ModelScope.
  • Discusses potential issues like performance discrepancies with benchmarks and small vocabulary size for Chinese.
  • Explores potential improvements through vocabulary expansion and further pre-training.

Maintenance & Community

The repository is maintained by CrazyBoyM. Links to ModelScope and Hugging Face are provided for model access. No specific community channels (Discord, Slack) or roadmap are mentioned.

Licensing & Compatibility

The README does not explicitly state a license for the curated models or the repository's code. Compatibility for commercial use or closed-source linking is not specified.

Limitations & Caveats

The author expresses disappointment with the actual performance of Phi-3-mini compared to benchmarks, suggesting potential benchmark manipulation or a need for further optimization. The small vocabulary size is noted as a significant drawback for Chinese language processing, leading to inefficient tokenization. The project is presented as potentially suitable for lightweight, vertical tasks rather than general-purpose chat.
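The tokenization inefficiency noted above can be illustrated directly: a tokenizer whose vocabulary has few dedicated Chinese tokens must fall back to byte-level tokens for many characters, and every CJK character occupies 3 bytes in UTF-8. A minimal stdlib-only sketch of the worst case (the pure byte-fallback assumption is ours, not a measurement of the actual Phi-3 tokenizer):

```python
# Each CJK character is 3 bytes in UTF-8. Under pure byte-level
# fallback, one Chinese character can therefore cost up to 3 tokens,
# while most English characters cost at most 1.
def worst_case_byte_tokens(text: str) -> int:
    """Upper bound on token count under pure byte-level fallback."""
    return len(text.encode("utf-8"))

chinese = "你好，世界"   # 5 characters
english = "hello world"  # 11 characters

print(worst_case_byte_tokens(chinese))   # 15 bytes -> up to 15 fallback tokens
print(worst_case_byte_tokens(english))   # 11 bytes -> at most 11 fallback tokens
```

Real tokenizers mix whole-character tokens with byte fallbacks, but the asymmetry is the point: with a small vocabulary, Chinese text of a given length consumes context window and compute several times faster than comparable English, which is why the repository flags vocabulary expansion as a promising direction.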

Health Check

  • Last Commit: 9 months ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 2 stars in the last 30 days
