LAW-GPT by LiuHC0428

Chinese legal dialogue model fine-tuned from ChatGLM-6B

created 2 years ago
1,156 stars

Top 34.2% on sourcepulse

Project Summary

This project provides LAW-GPT (XieZhi), a Chinese legal large language model designed to offer professional and reliable answers to legal questions. It targets individuals facing legal issues, aiming to make legal information accessible and contribute to a more lawful society.

How It Works

LAW-GPT is built on ChatGLM-6B and fine-tuned with LoRA using 16-bit instruction tuning. Its training data combines existing legal Q&A datasets with high-quality legal-text Q&A generated via Self-Instruct, anchored to statutes and real cases. This grounding improves the reliability and professionalism of the model's responses in the legal domain, notably by having it cite the relevant statutes.
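Below is a minimal sketch of how such a LoRA 16-bit fine-tune of ChatGLM-6B is typically wired up with the peft library. The rank, dropout, and target module names are illustrative assumptions, not the project's exact configuration.

```python
# Hedged sketch: LoRA adapter on ChatGLM-6B in fp16 via peft.
# Hyperparameters and target_modules are assumptions, not LAW-GPT's settings.
from transformers import AutoModel, AutoTokenizer
from peft import LoraConfig, TaskType, get_peft_model

tokenizer = AutoTokenizer.from_pretrained("THUDM/chatglm-6b", trust_remote_code=True)
base = AutoModel.from_pretrained("THUDM/chatglm-6b", trust_remote_code=True).half()  # 16-bit weights

lora_cfg = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    r=8,                                 # adapter rank (assumed)
    lora_alpha=32,
    lora_dropout=0.1,
    target_modules=["query_key_value"],  # ChatGLM's fused attention projection
)
model = get_peft_model(base, lora_cfg)
model.print_trainable_parameters()       # only adapter weights train; the 6B base stays frozen
```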

Quick Start & Requirements

  • Installation: Clone the repository, navigate to src, and run pip install -r requirements.txt. The peft library requires local installation (cd peft && pip install -e .).
  • Prerequisites: Requires downloading ChatGLM-6B model parameters (into ./model), retrieval model parameters (into ./retriver), and text2vec-base-chinese model parameters (into ./text2vec-base-chinese).
  • Execution: Run CUDA_VISIBLE_DEVICES=$cuda_id python ./demo.py for basic interaction, or CUDA_VISIBLE_DEVICES=$cuda_id python ./demo_r.py for retrieval-augmented interaction; a sketch of the retrieval flow follows this list.
  • Hardware: Requires a single GPU with >= 15GB VRAM.
  • Links: Project Repository
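The sketch below approximates the retrieval-augmented flow of demo_r.py, assuming the directory layout above and ChatGLM-6B's standard chat() interface. The statute corpus and prompt template are hypothetical; the project's actual retriever may differ.

```python
# Hedged sketch of retrieval-augmented QA in the spirit of demo_r.py:
# embed statutes with text2vec-base-chinese, retrieve the closest one to the
# query, and prepend it to the prompt. Paths and templates are assumptions.
from sentence_transformers import SentenceTransformer, util
from transformers import AutoModel, AutoTokenizer

embedder = SentenceTransformer("./text2vec-base-chinese")  # downloaded per prerequisites

# Toy corpus; the project retrieves from a full legal-text index.
statutes = [
    "《中华人民共和国刑法》第二百六十四条：盗窃公私财物，数额较大的……",  # Criminal Law, Art. 264 (theft)
    "《中华人民共和国民法典》第五百七十七条：当事人一方不履行合同义务……",  # Civil Code, Art. 577 (breach of contract)
]
corpus_emb = embedder.encode(statutes, convert_to_tensor=True)

tokenizer = AutoTokenizer.from_pretrained("./model", trust_remote_code=True)
model = AutoModel.from_pretrained("./model", trust_remote_code=True).half().cuda().eval()
# The fine-tuned LoRA adapter, if used, would be layered on via
# peft.PeftModel.from_pretrained (checkpoint path assumed).

query = "我的东西被偷了，对方会受到什么处罚？"  # "My things were stolen; what penalty applies?"
hit = util.semantic_search(embedder.encode(query, convert_to_tensor=True), corpus_emb, top_k=1)[0][0]
prompt = f"参考法条：{statutes[hit['corpus_id']]}\n问题：{query}\n请结合法条作答。"  # "Answer citing the statute."

response, _ = model.chat(tokenizer, prompt, history=[])
print(response)
```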

Highlighted Details

  • Fine-tuned on a dataset of 92k scenario-based Q&A with legal references and 52k single-turn Q&A cleaned from the CrimeKgAssitant dataset.
  • Employs "Reliable-Self-Instruction": specific legal texts are supplied to ChatGPT so that the generated questions and answers stay grounded in real statutes (see the sketch after this list).
  • Training utilizes model parallelism, achievable on a minimum of 4x 3090 GPUs for LoRA 16-bit fine-tuning.
  • Model outputs include statutory citations for increased reliability.
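The "Reliable-Self-Instruction" step can be pictured as follows: each generation prompt embeds a concrete statute, so the synthetic Q&A pair cites real law rather than hallucinated provisions. The prompt wording and the legacy openai (<1.0) call below are assumptions, not the project's actual generation script.

```python
# Hedged illustration of "Reliable-Self-Instruction": condition ChatGPT's
# generation on a specific statute so the synthetic Q&A cites real law.
# Prompt text and API usage are assumptions (legacy openai<1.0 interface).
import openai

statute = "《中华人民共和国刑法》第二百六十四条：盗窃公私财物，数额较大的……"  # Criminal Law, Art. 264

prompt = (
    "请根据下面的法律条文，生成一个贴近现实场景的法律咨询问题，"  # "Given the statute below, write a realistic legal question..."
    "并给出引用该条文的专业解答。\n条文：" + statute              # "...and a professional answer that cites it."
)

resp = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": prompt}],
)
print(resp["choices"][0]["message"]["content"])  # one statute-grounded Q&A pair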

Maintenance & Community

  • Developed by four students from Shanghai Jiao Tong University, with guidance from Associate Professor Wang Yu.
  • The project is described as actively under development, though the most recent commit is about a year old (see Health Check below).

Licensing & Compatibility

  • The repository does not explicitly state a license. The citation format suggests a research-oriented release.
  • Users are advised to verify licensing for commercial or closed-source use.

Limitations & Caveats

The project's disclaimer states that the pre-trained model is for reference and research only, and its accuracy and reliability are not guaranteed. It explicitly warns against using the model for actual applications or decision-making, with users assuming all risks.

Health Check

  • Last commit: 1 year ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Stars (last 90 days): 24
