zeroentropy-ai — Novel LLM-powered chunking for RAG
Top 99.6% on SourcePulse
Summary
This project addresses the significant challenge of effective document chunking for Retrieval Augmented Generation (RAG) applications. It introduces zChunk, a novel strategy that leverages Llama 3.1 70B to automatically segment documents into semantically coherent chunks, aiming to improve retrieval accuracy and signal-to-noise ratios. zChunk offers a robust, out-of-the-box solution for RAG preprocessing, reducing the need for extensive manual tuning and custom regex development.
How It Works
zChunk employs a prompt-based approach in which Llama 3.1 70B is instructed to insert a special, non-corpus token (e.g., "段") at semantically meaningful boundaries within a document. This method bypasses the brittleness of regex-based splitting and the limitations of fixed-size or purely embedding-similarity-based chunking. For greater efficiency, zChunk uses low-level access to the LLM's log probabilities to identify optimal chunking points without generating full output tokens, significantly reducing inference latency. This optimization is crucial for processing large documents rapidly.
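The boundary-selection idea described above can be sketched as follows. This is a minimal illustration, not the project's actual API: it assumes you have already obtained, from the LLM, a log probability for the boundary token at each candidate position (here, after each sentence), and simply splits wherever that probability clears a threshold. All function names and the threshold value are hypothetical.

```python
# Hypothetical sketch of logprob-driven chunking in the zChunk style.
# `logprobs[i]` is assumed to be the LLM's log probability of emitting the
# special boundary token (e.g. "段") immediately after sentence i.

def select_boundaries(logprobs, threshold=-2.0):
    """Return positions where the boundary token is likely enough to split."""
    return [i for i, lp in enumerate(logprobs) if lp > threshold]

def chunk_text(sentences, logprobs, threshold=-2.0):
    """Join sentences into chunks, cutting at high-probability boundaries."""
    cuts = select_boundaries(logprobs, threshold)
    chunks, start = [], 0
    for cut in cuts:
        chunks.append(" ".join(sentences[start:cut + 1]))
        start = cut + 1
    if start < len(sentences):
        chunks.append(" ".join(sentences[start:]))
    return chunks

# Example with stubbed logprobs (in practice these come from scoring the LLM's
# likelihood of the boundary token, without generating full output tokens).
sents = ["Intro sentence.", "More intro.", "New topic begins.", "Topic detail."]
lps = [-8.0, -0.5, -9.0, -6.0]  # boundary is likely only after sentence 2
print(chunk_text(sents, lps))
```

Because only the boundary token's probability is scored at each position, no full generation pass is needed, which is the latency saving the description refers to.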
Quick Start & Requirements
Last activity: 1 year ago (Inactive)