LLM for modern Chinese to classical Chinese translation
Top 93.7% on SourcePulse
This project provides a large language model that translates modern Chinese sentences into classical Chinese (Wenyan). It is built on the Xunzi base model and fine-tuned with LoRA on a parallel corpus of classical and modern Chinese texts. It targets researchers and developers working on historical Chinese linguistics or text generation.
How It Works
The model uses LoRA (Low-Rank Adaptation) to fine-tune a large base model efficiently. LoRA keeps the base weights frozen and trains only small low-rank update matrices, so it needs far fewer trainable parameters than full fine-tuning, which reduces computational cost and memory requirements. Training uses a parallel corpus of modern and classical Chinese texts to teach the model the stylistic and grammatical nuances of Wenyan.
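To make the parameter savings concrete, here is a minimal pure-Python sketch of the LoRA idea. It is not the project's training code: the function names (matvec, lora_forward, trainable_params) and the layer sizes are hypothetical and chosen only for illustration. It assumes the standard LoRA formulation, in which a frozen weight matrix W is augmented by a scaled low-rank product B·A.

```python
def matvec(m, v):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(w * x for w, x in zip(row, v)) for row in m]


def lora_forward(w, a, b, x, alpha):
    """Return W x + (alpha / r) * B (A x).

    w: frozen base weight, shape d_out x d_in (not trained)
    a: trainable down-projection, shape r x d_in
    b: trainable up-projection, shape d_out x r
    """
    r = len(a)                          # rank of the adapter
    base = matvec(w, x)                 # frozen path
    update = matvec(b, matvec(a, x))    # low-rank path
    scale = alpha / r
    return [h + scale * u for h, u in zip(base, update)]


def trainable_params(d_out, d_in, r):
    """Return (full fine-tuning params, LoRA params) for one weight matrix."""
    full = d_out * d_in
    lora = r * (d_in + d_out)
    return full, lora


# Hypothetical layer size: a 4096 x 4096 projection with rank-8 adapters.
full, lora = trainable_params(4096, 4096, 8)
print(full, lora)  # 16777216 65536

# With B initialised to zeros, the adapter starts as a no-op:
w = [[0.5, -1.0], [2.0, 0.25]]
a = [[0.1, 0.2]]         # r = 1
b = [[0.0], [0.0]]
x = [1.0, 2.0]
assert lora_forward(w, a, b, x, alpha=2.0) == matvec(w, x)
```

For this illustrative layer, the adapter trains 65,536 parameters instead of 16,777,216, a 256-fold reduction for that one matrix. Starting B at zero means fine-tuning begins from the base model's exact behavior.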
Quick Start & Requirements
Install dependencies: pip install -r requirements.txt
Check the CUDA toolkit version: nvcc --version
Data directory: data/original
Configuration: config/config.py
get_data.py for data processing
finetune.py (requires SwanLab API key for visualization)
merge_and_push_model.py
web_demo_local_inference.py or Hugging Face's Serverless Inference API
Highlighted Details
Maintenance & Community
Licensing & Compatibility
Limitations & Caveats
The project requires manual setup of the base model and data preparation. Training visualization relies on an external service (SwanLab), and the absence of a specified license may impact commercial use or redistribution.
1 year ago
Inactive