phi3-Chinese by CrazyBoyM

Chinese post-training models for Phi-3

Created 1 year ago
322 stars

Top 84.3% on SourcePulse

Project Summary

This repository curates and shares fine-tuned Chinese versions of Microsoft's Phi-3 models, targeting developers and researchers interested in efficient, mobile-deployable LLMs. It aims to highlight lesser-known Phi-3 variants and provide tutorials for training and deployment, offering a potential alternative to larger models like Llama3 8B.

How It Works

The project focuses on collecting and presenting various fine-tuned Phi-3 models, particularly those with Chinese language capabilities. It links to specific fine-tuned versions available on platforms like ModelScope, including incremental SFT and DPO variants, and plans to explore vocabulary expansion. The core idea is to leverage Phi-3's small size (3.8B parameters) and reported performance advantages for practical deployment scenarios.

Quick Start & Requirements

  • Web Deployment: streamlit run deploy/streamlit_for_instruct.py ./Phi-3-mini-128k-instruct-Chinese
  • Prerequisites: Access to specific fine-tuned models (links provided to ModelScope and Hugging Face).
  • Resources: No specific hardware requirements are detailed, but the project emphasizes mobile deployability.

Highlighted Details

  • Focuses on Phi-3-mini, a 3.8B-parameter model claimed to outperform Llama3 8B on benchmarks.
  • Provides links to Chinese fine-tuned versions (SFT, DPO) on ModelScope.
  • Discusses potential issues like performance discrepancies with benchmarks and small vocabulary size for Chinese.
  • Explores potential improvements through vocabulary expansion and further pre-training.

Maintenance & Community

The repository is maintained by CrazyBoyM. Links to ModelScope and Hugging Face are provided for model access. No specific community channels (Discord, Slack) or roadmap are mentioned.

Licensing & Compatibility

The README does not explicitly state a license for the curated models or the repository's code. Compatibility for commercial use or closed-source linking is not specified.

Limitations & Caveats

The author expresses disappointment with the actual performance of Phi-3-mini compared to benchmarks, suggesting potential benchmark manipulation or a need for further optimization. The small vocabulary size is noted as a significant drawback for Chinese language processing, leading to inefficient tokenization. The project is presented as potentially suitable for lightweight, vertical tasks rather than general-purpose chat.
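The tokenization inefficiency noted above can be illustrated directly: a tokenizer whose vocabulary has few dedicated Chinese tokens must fall back to byte-level tokens for many characters, and every CJK character occupies 3 bytes in UTF-8. A minimal stdlib-only sketch of the worst case (the pure byte-fallback assumption is ours, not a measurement of the actual Phi-3 tokenizer):

```python
# Each CJK character is 3 bytes in UTF-8. Under pure byte-level
# fallback, one Chinese character can therefore cost up to 3 tokens,
# while most English characters cost at most 1.
def worst_case_byte_tokens(text: str) -> int:
    """Upper bound on token count under pure byte-level fallback."""
    return len(text.encode("utf-8"))

chinese = "你好，世界"   # 5 characters
english = "hello world"  # 11 characters

print(worst_case_byte_tokens(chinese))   # 15 bytes -> up to 15 fallback tokens
print(worst_case_byte_tokens(english))   # 11 bytes -> at most 11 fallback tokens
```

Real tokenizers mix whole-character tokens with byte fallbacks, but the asymmetry is the point: with a small vocabulary, Chinese text of a given length consumes context window and compute several times faster than comparable English, which is why the repository flags vocabulary expansion as a promising direction.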

Health Check

  • Last Commit: 9 months ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 2 stars in the last 30 days
