XunziALLM by Xunzi-LLM-of-Chinese-classics

LLM for Chinese classics

Created 2 years ago

405 stars

Top 71.8% on SourcePulse

Project Summary

This project provides a suite of large language models (LLMs) built for processing and understanding classical Chinese texts. Aimed at researchers, linguists, and enthusiasts of Chinese classics, the models support information extraction, translation, and analysis of ancient literature.

How It Works

The Xunzi series offers both base and chat models, built on established open-source LLMs such as Qwen, ChatGLM3, and Baichuan2. Each model keeps the architecture of its foundation model and is specialized for classical Chinese through targeted fine-tuning. As a result, a Xunzi model is called the same way as the base model it derives from.
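Because each Xunzi model is called like its base model, the loading code depends mainly on which family it derives from. The sketch below is a minimal illustration, assuming the models are loaded with Hugging Face `transformers` and that each family follows the loading convention documented by its upstream project (remote code for Qwen, ChatGLM3, and Baichuan2; native support for Qwen1.5 and Qwen2). The names `LOADING_CONVENTIONS`, `loader_spec`, and `load_model` are hypothetical, and the source does not state any model path, so none is filled in here.

```python
# Sketch only: maps each base-model family to the transformers loading
# convention its upstream project documents. Assumption, not taken from
# the Xunzi README.
LOADING_CONVENTIONS = {
    # family: (transformers auto class name, needs trust_remote_code)
    "Qwen": ("AutoModelForCausalLM", True),
    "ChatGLM3": ("AutoModel", True),
    "Baichuan2": ("AutoModelForCausalLM", True),
    "Qwen1.5": ("AutoModelForCausalLM", False),
    "Qwen2": ("AutoModelForCausalLM", False),
}


def loader_spec(family: str) -> dict:
    """Return the auto class name and trust_remote_code flag for a family."""
    auto_class, needs_remote_code = LOADING_CONVENTIONS[family]
    return {"auto_class": auto_class, "trust_remote_code": needs_remote_code}


def load_model(model_path: str, family: str):
    """Load tokenizer and model from a local path or hub id (hypothetical helper)."""
    import transformers  # imported lazily so loader_spec works without it

    spec = loader_spec(family)
    auto_cls = getattr(transformers, spec["auto_class"])
    tokenizer = transformers.AutoTokenizer.from_pretrained(
        model_path, trust_remote_code=spec["trust_remote_code"]
    )
    model = auto_cls.from_pretrained(
        model_path, trust_remote_code=spec["trust_remote_code"]
    )
    return tokenizer, model
```

A caller would pass the path of a downloaded Xunzi checkpoint and the family it was built on, for example `load_model(path, "Qwen1.5")`.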

Quick Start & Requirements

API Call Example: Uses the openai Python library.

from openai import OpenAI

# The hosted endpoint accepts any string as the API key.
openai_api_key = "ANY THING"
openai_api_base = "http://xunziallm.njau.edu.cn:21180/v1"
client = OpenAI(api_key=openai_api_key, base_url=openai_api_base)

# The model is addressed by its path on the hosting server.
chat_response = client.chat.completions.create(
    model="/home/gpu0/xunzi_web/Xunzi-Qwen1.5-7B_chat",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": '...'},
    ],
)
print(chat_response.choices[0].message.content)

Prerequisites: Python, openai library. A hosted API endpoint is provided for Xunzi-Qwen1.5-7B_chat.
Resources: Specific hardware requirements for self-hosting base models are not detailed but would align with the underlying Qwen/ChatGLM/Baichuan models.
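For readers who prefer not to install the openai library, the same call can be sketched with the Python standard library alone. This is a minimal sketch, assuming the endpoint follows the usual OpenAI-compatible chat-completions convention (a JSON POST to `/chat/completions` with a bearer token). The helper names `build_chat_request` and `send` are hypothetical; the URL and model path are the ones given in the example above.

```python
import json
import urllib.request

API_BASE = "http://xunziallm.njau.edu.cn:21180/v1"
MODEL = "/home/gpu0/xunzi_web/Xunzi-Qwen1.5-7B_chat"


def build_chat_request(user_text, api_key="ANY THING", api_base=API_BASE, model=MODEL):
    """Build (but do not send) an OpenAI-style chat-completions request."""
    payload = {
        "model": model,
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": user_text},
        ],
    }
    return urllib.request.Request(
        url=f"{api_base}/chat/completions",
        # ensure_ascii=False keeps Chinese characters readable in the body
        data=json.dumps(payload, ensure_ascii=False).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def send(request, timeout=60):
    """Send the request and return the assistant's reply text."""
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]
```

Building the request separately from sending it makes it easy to inspect the payload before any network traffic occurs; `send(build_chat_request(text))` performs the actual call.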

Highlighted Details

Specialized Capabilities: Includes intelligent indexing, information extraction (people, events, places), poetry generation, high-quality translation, reading comprehension, lexical analysis (word segmentation, POS tagging), and automatic punctuation for classical texts.
Model Variety: Offers multiple models based on Qwen-7B, ChatGLM3-6B, Baichuan2-7B, Qwen1.5 (4B, 7B, 14B), and Qwen2 (1.5B, 7B), catering to different performance and resource needs.
Fine-tuning Potential: Base models are available for users to fine-tune with their own datasets for improved performance on specific downstream tasks.
API Access: Xunzi-Qwen1.5-7B_chat is accessible via an OpenAI-compatible API.
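The capabilities listed above are all reached through ordinary chat prompts, so a small helper can turn a task name into a message list. The sketch below is illustrative only: the instruction wording in `TASK_INSTRUCTIONS` is an assumption and not the project's official prompt set, and `build_messages` is a hypothetical helper. The resulting list can be passed as the `messages` argument of an OpenAI-compatible chat call.

```python
# Hypothetical instruction templates; the project's README may recommend
# different wording for each task.
TASK_INSTRUCTIONS = {
    "punctuate": "Add punctuation to the following classical Chinese text:",
    "translate": "Translate the following classical Chinese text into modern Chinese:",
    "extract_entities": (
        "List the people, events, and places mentioned in the following "
        "classical Chinese text:"
    ),
    "segment": (
        "Segment the following classical Chinese text into words and tag "
        "each word with its part of speech:"
    ),
}


def build_messages(task, text):
    """Return a chat message list for one of the supported task names."""
    if task not in TASK_INSTRUCTIONS:
        raise ValueError(f"unknown task: {task!r}")
    return [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": f"{TASK_INSTRUCTIONS[task]}\n{text}"},
    ]
```

Keeping the templates in one dictionary makes it simple to swap in the prompts that work best for a given model variant.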

Maintenance & Community

Contact: zhaozhixiao@stu.njau.edu.cn for questions.
Support: The project acknowledges ongoing development and welcomes feedback for future improvements.

Licensing & Compatibility

License: Not explicitly stated in the README.
Compatibility: Models are designed to be called using methods consistent with their base open-source counterparts (Qwen, ChatGLM3, Baichuan2, Qwen1.5, Qwen2).

Limitations & Caveats

The project acknowledges that models still have room for improvement and may contain unavoidable issues due to data and model complexity. The developers disclaim responsibility for any problems arising from data security, public opinion risks, or misuse of the models. Compliance with China's generative AI regulations is advised.

Health Check

Last Commit

5 months ago

Responsiveness

1 week

Star History

4 stars in the last 30 days