Chinese LLaMA tuned for legal use
LaWGPT is an open-source series of large language models tuned specifically for Chinese legal knowledge. It aims to give researchers and legal professionals stronger legal-domain understanding and task execution. The project provides both foundation models and instruction-tuned variants, offering improved semantic comprehension of and task performance on legal texts.
How It Works
LaWGPT builds upon general Chinese LLMs such as Chinese-LLaMA and ChatGLM. It expands the vocabulary with legal-domain tokens and continues pre-training on large Chinese legal corpora. It is then instruction fine-tuned on curated legal Q&A dialogues and Chinese judicial examination datasets, improving its ability to understand and answer legal queries.
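To make the fine-tuning step concrete, here is a minimal sketch of what an Alpaca-style instruction record and prompt assembly could look like. The field names, template wording, and example content are illustrative assumptions, not taken from the project's released datasets:

```python
# Hypothetical Alpaca-style instruction-tuning record; the fields and
# content below are illustrative, not from LaWGPT's actual data.
record = {
    "instruction": "请解释《中华人民共和国民法典》中关于诉讼时效的规定。",
    "input": "",
    "output": "根据《民法典》第一百八十八条，向人民法院请求保护民事权利的诉讼时效期间为三年。",
}

# Assemble a training prompt the way Alpaca-style fine-tuning commonly does.
prompt = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    f"### Instruction:\n{record['instruction']}\n\n"
    f"### Response:\n{record['output']}"
)
print(prompt)
```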
Quick Start & Requirements
Install dependencies within a conda environment (Python 3.10 recommended):

pip install -r requirements.txt

Inference is available through a web UI (scripts/webui.sh) and a command-line script (scripts/infer.sh).
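For programmatic use, a minimal inference sketch with Hugging Face transformers might look like the following. The checkpoint path is a placeholder (LaWGPT weights are released relative to a base model, so consult the repository for how to obtain a usable checkpoint), and `device_map="auto"` assumes the accelerate package is installed:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder path, not an official model id; replace with a local
# LaWGPT checkpoint prepared per the repository's instructions.
model_path = "path/to/lawgpt-checkpoint"

tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(model_path, device_map="auto")

# Example legal query: "What does the law say about private lending?"
prompt = "请简述民间借贷的相关法律规定。"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```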
Maintenance & Community
The project is supported by the Nanjing University Machine Learning and Data Mining Research Group. Collaboration is encouraged via GitHub Issues.
Licensing & Compatibility
All resources are strictly for academic research; commercial use is prohibited. The project does not guarantee the accuracy of model outputs and strictly forbids their use in real legal scenarios.
Limitations & Caveats
Due to resource and data limitations, the models may exhibit weaker memory and language capabilities, potentially leading to factual inaccuracies. Alignment with human intent is still preliminary, so outputs may be unpredictable, harmful, or misaligned with human preferences. Self-awareness and Chinese comprehension also need further improvement.