ChatLaw by PKU-YuanGroup

LLM for Chinese legal applications, with an accompanying research paper

created 2 years ago
7,296 stars

Top 7.2% on sourcepulse

Project Summary

ChatLaw is a large language model-based multi-agent legal assistant designed for Chinese legal language processing. It aims to make legal consulting accessible while mitigating AI hallucination risks through a Mixture-of-Experts (MoE) architecture, knowledge graphs, and a multi-agent system. It targets legal professionals and individuals seeking legal advice, offering improved reliability and accuracy in AI-generated legal responses.

How It Works

ChatLaw employs a Mixture-of-Experts (MoE) model; the latest version (ChatLaw2-MoE) uses a 4x7B MoE design based on the InternLM architecture. This approach lets different "experts" within the model specialize in different legal domains, improving response accuracy. Its training data is a high-quality legal dataset built with knowledge graphs and manual screening. Standard Operating Procedures (SOPs) modeled on law firm workflows are implemented to minimize errors and hallucinations, further enhancing the system's reliability.
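
As a rough illustration of the routing idea behind a MoE layer, the sketch below implements a top-2 gated expert layer in PyTorch. The dimensions, gating scheme, and top-k value here are assumptions chosen for clarity, not details published for ChatLaw2-MoE.

    # Minimal sketch of a top-2 gated Mixture-of-Experts feed-forward layer.
    # All sizes and the routing scheme are illustrative assumptions.
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class MoEFeedForward(nn.Module):
        def __init__(self, d_model=512, d_ff=2048, n_experts=4, top_k=2):
            super().__init__()
            self.experts = nn.ModuleList(
                nn.Sequential(nn.Linear(d_model, d_ff), nn.SiLU(), nn.Linear(d_ff, d_model))
                for _ in range(n_experts)
            )
            self.gate = nn.Linear(d_model, n_experts, bias=False)
            self.top_k = top_k

        def forward(self, x):                     # x: (tokens, d_model)
            logits = self.gate(x)                 # per-token expert scores
            weights, idx = logits.topk(self.top_k, dim=-1)
            weights = F.softmax(weights, dim=-1)  # normalize over the chosen experts
            out = torch.zeros_like(x)
            for k in range(self.top_k):           # send each token to its top-k experts
                for e, expert in enumerate(self.experts):
                    mask = idx[:, k] == e
                    if mask.any():
                        out[mask] += weights[mask, k, None] * expert(x[mask])
            return out

    layer = MoEFeedForward()
    print(layer(torch.randn(8, 512)).shape)       # torch.Size([8, 512])

In a 4x7B configuration like ChatLaw2-MoE, each expert would itself be a 7B-parameter network on the InternLM backbone, and the gate's per-token routing is what lets experts specialize in different legal domains.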

Quick Start & Requirements

  • Models are available on HuggingFace; a hypothetical loading sketch follows this list.
  • Specific hardware requirements (e.g., GPU, CUDA versions) are not detailed in the README.
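
Since the README stops at pointing to HuggingFace, the following is a hypothetical loading sketch using the standard transformers API. The repository ID is a placeholder, and the actual model names, prompt format, and memory requirements should be checked against the model cards on HuggingFace.

    # Hypothetical loading sketch via the standard transformers API.
    # The repo ID below is a placeholder, not a real model name.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "<chatlaw-model-id>"  # placeholder; see the project's HuggingFace page

    tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        torch_dtype="auto",      # let transformers pick a dtype from the checkpoint
        device_map="auto",       # requires accelerate; spreads weights over GPUs/CPU
        trust_remote_code=True,  # InternLM-based checkpoints often ship custom code
    )

    # "What is the limitation period for suing over a private loan?"
    prompt = "民间借贷纠纷的诉讼时效是多久？"
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    outputs = model.generate(**inputs, max_new_tokens=256)
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))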

Highlighted Details

  • Outperforms GPT-4 on Lawbench and the Unified Qualification Exam for Legal Professionals, achieving 7.73% higher accuracy and an 11-point lead, respectively.
  • ChatLaw-Text2Vec is a text similarity model trained on 93,000 court case decisions to match queries with relevant legal statutes (a generic retrieval sketch follows this list).
  • Demonstrates superior performance across multiple legal categories and cognitive tasks compared to other models.
  • Multi-agent collaboration process is designed to mimic legal consultation workflows, culminating in a detailed Legal Consultation Report.
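
To show how a text-similarity model of this kind supports statute retrieval, here is a generic sketch built on the sentence-transformers library. The model ID is a placeholder and the two statutes are sample articles from the PRC Civil Code; this is not ChatLaw's published retrieval pipeline.

    # Generic query-to-statute matching via embedding similarity.
    # The model ID is a placeholder, not the project's released artifact.
    from sentence_transformers import SentenceTransformer, util

    model = SentenceTransformer("<text2vec-model-id>")  # placeholder ID

    statutes = [
        # Civil Code Art. 667: definition of a loan contract.
        "借款合同是借款人向贷款人借款，到期返还借款并支付利息的合同。",
        # Civil Code Art. 188: three-year limitation period for civil claims.
        "向人民法院请求保护民事权利的诉讼时效期间为三年。",
    ]
    # "A friend hasn't repaid a loan for three years; can I still sue?"
    query = "朋友借钱三年没还，现在起诉还来得及吗？"

    # Embed the query and candidates, then rank by cosine similarity.
    statute_emb = model.encode(statutes, convert_to_tensor=True)
    query_emb = model.encode(query, convert_to_tensor=True)
    scores = util.cos_sim(query_emb, statute_emb)[0]
    best = scores.argmax().item()
    print(f"Best match ({scores[best].item():.3f}): {statutes[best]}")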

Maintenance & Community

  • The project is associated with PKU-YuanGroup.
  • A GitHub repository is provided for the project.

Licensing & Compatibility

  • The specific license type is not explicitly stated in the README; publication on GitHub does not by itself imply permissive terms.
  • Compatibility for commercial use or closed-source linking is not detailed.

Limitations & Caveats

  • The 33B version occasionally defaults to English responses due to limited Chinese training data in its base model (Anima-33B).
  • Despite the performance claims, hardware requirements for running the models are not documented.

Health Check

  • Last commit: 7 months ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 0
  • Issues (30d): 0

Star History

  • 96 stars in the last 90 days
