law-cn-ai by lvwzhen

AI legal assistant for Chinese law leveraging vector search

Created 2 years ago

4,909 stars

Top 10.1% on SourcePulse

Project Summary

This project provides an AI legal assistant for Chinese law that grounds OpenAI's text completions in a custom knowledge base of legal documents. It is aimed at developers and researchers building domain-specific AI applications, and it shows a compact way to combine external data with a large language model.

How It Works

The system follows a four-step process, with two steps at build time and two at runtime, so knowledge base preprocessing stays separate from query execution. At build time, the .mdx files are split into chunks and each chunk is converted into an embedding through OpenAI's API; the embeddings are then stored in a PostgreSQL database with the pgvector extension. A checksum is kept for each source file so that embeddings are regenerated only when the file has changed. At runtime, the user's query is embedded the same way and a vector similarity search against the stored embeddings retrieves the most relevant chunks. Those chunks are injected into an OpenAI GPT-3 text completion prompt, and the response is streamed back to the client.
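The build-time half of this process can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the project's code: the helper names (checksum, chunk_by_heading, files_to_reembed), the use of SHA-256, and splitting at Markdown headings are all choices made for the example, and the call to the embedding API is left out.

```python
import hashlib
import re


def checksum(text: str) -> str:
    """Return a SHA-256 hex digest of a file's contents (hash choice is an assumption)."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def chunk_by_heading(text: str) -> list[str]:
    """Split a document into chunks, starting a new chunk at each Markdown heading line."""
    # Zero-width split: the pattern matches the empty string just before a heading.
    parts = re.split(r"(?m)^(?=#{1,6} )", text)
    return [part.strip() for part in parts if part.strip()]


def files_to_reembed(files: dict[str, str], stored: dict[str, str]) -> list[str]:
    """Return the paths whose current checksum differs from the stored one.

    `files` maps path -> current contents; `stored` maps path -> checksum
    recorded during the previous build (both hypothetical structures).
    """
    return [path for path, text in files.items() if stored.get(path) != checksum(text)]
```

Only the paths returned by files_to_reembed would need to be chunked and sent to the embedding API again; unchanged files keep their existing rows in the database.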

Quick Start & Requirements

Install/Run: Deploy to Vercel. Local development requires Docker for Supabase.
Prerequisites: OpenAI API Key, Node.js, pnpm, Docker.
Setup: Local setup involves configuring .env with OPENAI_KEY, starting Supabase via npx supabase start, and running the Next.js app with pnpm dev.
Links: Docs, pgvector, YouTube

Highlighted Details

Utilizes pgvector for efficient similarity search within PostgreSQL.
Implements a build-time embedding generation process for optimized knowledge base updates.
Streams responses from OpenAI for a more interactive user experience.
Leverages Vercel for simplified deployment and Supabase for database management.
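
The runtime retrieval step can be illustrated without a database. The sketch below ranks stored chunks by cosine similarity in plain Python and assembles a prompt from the best matches. It is a stand-in for what pgvector does inside PostgreSQL, not the project's actual query: the function names, the prompt wording, the similarity metric, and the k and threshold defaults are assumptions chosen for the example.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two equal-length vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec, rows, k=3, threshold=0.0):
    """Return the text of the k most similar rows at or above the threshold.

    `rows` is a list of (text, embedding) pairs standing in for database rows.
    """
    scored = [(cosine_similarity(query_vec, vec), text) for text, vec in rows]
    scored = [item for item in scored if item[0] >= threshold]
    scored.sort(key=lambda item: item[0], reverse=True)
    return [text for _, text in scored[:k]]


def build_prompt(question: str, sections: list[str]) -> str:
    """Inject the retrieved sections into a completion prompt (hypothetical wording)."""
    context = "\n---\n".join(sections)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )
```

In the deployed system the ranking would run as a SQL query so that only the matching chunks leave the database; the Python version just makes the ordering and thresholding logic visible.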

Maintenance & Community

The project is based on a Supabase community template. Further community engagement or roadmap details are not explicitly provided in the README.

Licensing & Compatibility

The repository's license is not specified in the README. Whether the project can be used commercially or combined with closed-source code therefore depends on the licenses and terms of its dependencies (OpenAI API, Next.js, Supabase, etc.).

Limitations & Caveats

The project relies heavily on the OpenAI API, incurring associated costs. The effectiveness of the legal assistant is directly tied to the quality and comprehensiveness of the provided .mdx legal documents.

