ancient_text_generation_LLM by JianXiao2021

LLM for modern Chinese to classical Chinese translation

Created 1 year ago

280 stars

Top 93.1% on SourcePulse

Project Summary

This project provides a large language model that translates modern Chinese sentences into classical Chinese (Wenyan). It is built on the Xunzi base model and fine-tuned with LoRA on a parallel corpus of classical and modern Chinese texts. It is aimed at researchers and developers working on historical Chinese linguistics or text generation.

How It Works

The model uses LoRA (Low-Rank Adaptation) to fine-tune a large base model efficiently. LoRA keeps the base weights frozen and trains only small low-rank update matrices, so far fewer parameters are trained than in full fine-tuning, which reduces compute cost and memory requirements. Training uses a parallel corpus of modern and classical Chinese texts to teach the model the stylistic and grammatical nuances of Wenyan.
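To make the parameter saving concrete, here is a minimal pure-Python sketch of the general LoRA idea, not the project's own training code. The matrix size (4096 x 4096), the rank (8), and the helper names `lora_param_counts`, `matvec`, and `lora_forward` are illustrative assumptions; the README summary does not state the actual LoRA settings.

```python
def lora_param_counts(d_out, d_in, rank):
    """Return (full, lora) trainable-parameter counts for one weight matrix."""
    full = d_out * d_in            # full fine-tuning trains every entry of W
    lora = rank * (d_in + d_out)   # A is rank x d_in, B is d_out x rank
    return full, lora


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def lora_forward(x, W, A, B, alpha):
    """Compute h = W x + (alpha / r) * B (A x); W stays frozen, A and B are trained."""
    rank = len(A)                        # r is the number of rows in A
    base = matvec(W, x)                  # frozen base-model path
    update = matvec(B, matvec(A, x))     # low-rank update path
    scale = alpha / rank
    return [b + scale * u for b, u in zip(base, update)]


# Illustrative sizes only (assumed, not taken from the project).
full, lora = lora_param_counts(d_out=4096, d_in=4096, rank=8)
print(full, lora, f"{lora / full:.2%}")
```

With these assumed sizes, the low-rank matrices hold 65,536 trainable parameters against 16,777,216 for the full matrix, roughly 0.39%. When `B` is initialized to zeros, as is conventional for LoRA, `lora_forward` returns exactly the base model's output, so training starts from the unmodified base model.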

Quick Start & Requirements

Install dependencies: pip install -r requirements.txt
PyTorch with CUDA support is required; install based on your CUDA version (check with nvcc --version).
Download the base model.
Prepare data in data/original.
Configure paths and settings in config/config.py.
Run get_data.py for data processing.
Fine-tune using finetune.py (requires SwanLab API key for visualization).
Merge and export the model with merge_and_push_model.py.
Inference can be done via web_demo_local_inference.py or Hugging Face's Serverless Inference API.
Demo: https://modelscope.cn/studios/chostem/ancient_Chinese_text_generator, https://huggingface.co/spaces/cofeg/ancient_Chinese_text_generator_1.5B
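The script steps above can be chained in a small driver. This is a hedged sketch: the script names come from the steps listed, but running them with no command-line arguments (with all settings read from config/config.py) is an assumption, and `PIPELINE`, `build_commands`, and `run_pipeline` are hypothetical names introduced here. It defaults to a dry run that only prints the commands.

```python
import subprocess
import sys

# Script names come from the Quick Start steps; invoking them without
# arguments is an assumption (settings presumed to live in config/config.py).
PIPELINE = ["get_data.py", "finetune.py", "merge_and_push_model.py"]


def build_commands(python=sys.executable):
    """Return one command per pipeline stage, in execution order."""
    return [[python, script] for script in PIPELINE]


def run_pipeline(dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    handled = []
    for command in build_commands():
        print(" ".join(command))
        if not dry_run:
            # check=True stops the pipeline at the first failing stage.
            subprocess.run(command, check=True)
        handled.append(command)
    return handled


if __name__ == "__main__":
    run_pipeline(dry_run=True)
```

Stopping at the first failure matters here because each stage consumes the previous stage's output: fine-tuning needs the processed data, and merging needs the trained adapter.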

Highlighted Details

LoRA fine-tuning workflow provided.
Training process visualization via SwanLab.
Option to resume training from checkpoints.
Model merging and pushing to Hugging Face supported.
Local inference demo included.
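Resuming from a checkpoint usually starts with locating the most recent one. The sketch below assumes checkpoints are saved as `checkpoint-<step>` directories, the naming used by the Hugging Face Trainer; whether this project follows that layout is not confirmed by the README summary, and `latest_checkpoint` is a hypothetical helper.

```python
import re
from pathlib import Path

# Assumed directory naming: checkpoint-<step>, e.g. checkpoint-1000.
CHECKPOINT_PATTERN = re.compile(r"^checkpoint-(\d+)$")


def latest_checkpoint(output_dir):
    """Return the checkpoint directory with the highest step, or None."""
    root = Path(output_dir)
    if not root.is_dir():
        return None
    best_step, best_path = -1, None
    for child in root.iterdir():
        match = CHECKPOINT_PATTERN.match(child.name)
        if child.is_dir() and match:
            step = int(match.group(1))   # compare numerically, not as text
            if step > best_step:
                best_step, best_path = step, child
    return best_path
```

Comparing step numbers as integers avoids the lexicographic trap where `checkpoint-900` would sort after `checkpoint-1000`.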

Maintenance & Community

No specific contributors, sponsorships, or community links (Discord/Slack) are mentioned in the README.

Licensing & Compatibility

The README does not explicitly state a license.

Limitations & Caveats

The project requires manual setup of the base model and data preparation. Training visualization relies on an external service (SwanLab), and the absence of a specified license may impact commercial use or redistribution.

