ComoRAG: Cognitive RAG for stateful long narrative reasoning
ComoRAG is a Retrieval-Augmented Generation (RAG) framework engineered for complex, stateful reasoning over long documents and multi-document collections. It targets researchers and practitioners needing advanced capabilities in question answering, information extraction, and knowledge graph construction from extensive narratives. ComoRAG offers significant performance gains on challenging long-context benchmarks by adopting a cognitive-inspired, iterative reasoning approach that mimics human memory processes.
How It Works
ComoRAG addresses the limitations of stateless, single-step RAG systems in handling intricate, long-range narrative comprehension. Its core innovation lies in a cognition-inspired methodology that models narrative reasoning as a dynamic interplay between acquiring new evidence and consolidating past knowledge. This is achieved through iterative reasoning cycles, where the system generates targeted "probing queries" to explore new evidence paths. Retrieved information is integrated into a "global memory pool," progressively building a coherent context for the query. This "Reason → Probe → Retrieve → Consolidate → Resolve" cycle enables principled, stateful retrieval-based reasoning, outperforming strong RAG baselines by up to 11% on benchmarks exceeding 200K tokens.
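The cycle described above can be sketched as a simple loop. This is a minimal illustration, not the project's actual API: the function names, the toy keyword retriever, and the probe/resolve callbacks are all assumptions made for clarity.

```python
# Minimal sketch of a Reason -> Probe -> Retrieve -> Consolidate -> Resolve
# loop in the style described above. All names here are illustrative
# stand-ins, not ComoRAG's real interfaces.

def retrieve(query, corpus, k=2):
    """Toy retriever: rank passages by word overlap with the query."""
    words = set(query.lower().split())
    scored = sorted(corpus, key=lambda p: -len(words & set(p.lower().split())))
    return scored[:k]

def reasoning_loop(question, corpus, probe_fn, resolve_fn, max_cycles=3):
    """Iterate probing queries until the question can be resolved."""
    memory = []  # the "global memory pool" of consolidated evidence
    for _ in range(max_cycles):
        probe = probe_fn(question, memory)              # Reason + Probe
        evidence = retrieve(probe, corpus)              # Retrieve
        for passage in evidence:                        # Consolidate
            if passage not in memory:
                memory.append(passage)
        answer = resolve_fn(question, memory)           # Resolve
        if answer is not None:
            return answer, memory
    return None, memory
```

In this sketch, `probe_fn` would be an LLM call that generates the next targeted probing query from the question and the memory pool, and `resolve_fn` an LLM call that answers only once the consolidated context suffices.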
Quick Start & Requirements
Install dependencies with pip install -r requirements.txt. Prepare inputs as corpus.jsonl (documents) and qas.jsonl (questions). Then run main_openai.py for OpenAI API integration, or main_vllm.py for local vLLM server deployment.
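A short sketch of preparing the two JSONL input files named above. The field names ("id", "text", "question", "answer") are assumptions for illustration; check the repository's example data for the exact schema.

```python
# Write one JSON object per line, as the .jsonl extension implies.
# Field names below are assumed, not taken from the project.
import json

docs = [{"id": "doc-0", "text": "Chapter 1: the story begins ..."}]
qas = [{"id": "q-0", "question": "How does the story begin?", "answer": ""}]

with open("corpus.jsonl", "w", encoding="utf-8") as f:
    for d in docs:
        f.write(json.dumps(d, ensure_ascii=False) + "\n")

with open("qas.jsonl", "w", encoding="utf-8") as f:
    for q in qas:
        f.write(json.dumps(q, ensure_ascii=False) + "\n")
```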
Maintenance & Community
The project is marked as active. Contributions are welcomed via Issues or Pull Requests. No specific community channels (like Discord/Slack) or notable contributors/sponsorships are listed in the README.
Licensing & Compatibility
The project is released under the MIT License, which generally permits broad use, modification, and distribution, including for commercial purposes, with minimal restrictions.
Limitations & Caveats
A known issue exists with remote embedding models deployed via vLLM, where missing local tokenizer files can cause failures. Planned features include support for additional embedding model providers.