arcana  by georgeguimaraes

RAG library for Elixir/Phoenix with agentic pipelines

Created 2 months ago
266 stars

Top 96.3% on SourcePulse

Summary

Arcana is an embeddable Retrieval-Augmented Generation (RAG) library for Elixir/Phoenix applications. It lets developers add vector search, document retrieval, and AI question answering directly to their projects, and it supports both simple RAG and more sophisticated agentic pipelines.

How It Works

Arcana implements RAG as a configurable pipeline: text is chunked, embedded (locally via Bumblebee or remotely via OpenAI), and stored in a swappable vector backend (pgvector or HNSWLib). Search runs in semantic, full-text, or hybrid mode; hybrid mode merges the vector and full-text results with Reciprocal Rank Fusion. The agentic RAG layer orchestrates complex Q&A through retrieval gating, query expansion, decomposition, multi-hop reasoning, and re-ranking. Optional GraphRAG adds a knowledge graph with entity extraction, community detection, and fusion search.
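
To make the hybrid search step concrete, here is a minimal sketch of Reciprocal Rank Fusion in general. It is written in Python purely for illustration and is not Arcana's Elixir API: the function name, the document ids, and the constant k=60 (a value commonly used with this technique) are assumptions, and the summary does not say which constant Arcana uses.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document ids into one ranking.

    Each document scores 1 / (k + rank) for every list it appears in,
    where rank starts at 1. k=60 is an assumed, commonly used constant.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by id for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical result lists from a vector search and a full-text search.
semantic_hits = ["doc_a", "doc_b", "doc_c"]
fulltext_hits = ["doc_b", "doc_d", "doc_a"]

fused = reciprocal_rank_fusion([semantic_hits, fulltext_hits])
print(fused)
```

Because the method looks only at rank positions, it can merge result lists whose raw scores are not comparable, such as cosine similarities and full-text relevance scores.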

Quick Start & Requirements

Install with mix igniter.install arcana, or add the dependency manually and then run mix arcana.install and mix ecto.migrate. A PostgreSQL database with pgvector is required. Local embeddings require an Nx backend (EXLA, EMLX, or Torchx) and its corresponding dependency. The default PDF ingestion path requires system-installed Poppler utilities. Official guides are available.

Highlighted Details

  • Agentic Pipelines: Advanced RAG orchestration with retrieval gating, query expansion, decomposition, multi-hop reasoning, and re-ranking.
  • GraphRAG: Knowledge graph integration with entity extraction, community detection, and fusion search.
  • Hybrid Search: Combines vector and full-text search using Reciprocal Rank Fusion.
  • Pluggable Architecture: All components (chunkers, embedders, agent steps) are swappable.
  • LiveView Dashboard: Optional UI for document management and search.
  • Grounding: Hallucination detection via the Hallmark NLI model.
  • Flexible Backends: Supports pgvector/HNSWLib for vectors, local/OpenAI for embeddings.
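
The idea of swappable components in the list above can be pictured with a small, library-agnostic sketch. It is written in Python for illustration only and does not reflect Arcana's actual Elixir modules or behaviours: every name here (sentence_chunker, toy_embedder, InMemoryStore, ingest) is hypothetical, and the letter-frequency "embedding" is a toy stand-in for a real model.

```python
import math
from typing import Callable, List, Tuple

Chunker = Callable[[str], List[str]]
Embedder = Callable[[str], List[float]]


def sentence_chunker(text: str) -> List[str]:
    """Hypothetical chunker: split on periods and drop empty pieces."""
    return [part.strip() for part in text.split(".") if part.strip()]


def toy_embedder(text: str) -> List[float]:
    """Hypothetical embedder: a 26-dimensional letter-frequency vector."""
    vector = [0.0] * 26
    for ch in text.lower():
        if "a" <= ch <= "z":
            vector[ord(ch) - ord("a")] += 1.0
    return vector


def cosine(a: List[float], b: List[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class InMemoryStore:
    """Hypothetical vector store; a real backend would replace this class."""

    def __init__(self) -> None:
        self.rows: List[Tuple[str, List[float]]] = []

    def add(self, chunk: str, vector: List[float]) -> None:
        self.rows.append((chunk, vector))

    def search(self, query_vector: List[float], limit: int = 1) -> List[str]:
        ranked = sorted(
            self.rows, key=lambda row: cosine(query_vector, row[1]), reverse=True
        )
        return [chunk for chunk, _ in ranked[:limit]]


def ingest(text: str, chunker: Chunker, embedder: Embedder, store: InMemoryStore) -> None:
    """Chunk, embed, and store; each stage is passed in, so it can be swapped."""
    for chunk in chunker(text):
        store.add(chunk, embedder(chunk))


store = InMemoryStore()
ingest(
    "Elixir runs on the BEAM virtual machine. Postgres stores vectors with pgvector.",
    sentence_chunker,
    toy_embedder,
    store,
)
hits = store.search(toy_embedder("vectors in Postgres"), limit=1)
```

Since ingest receives the chunker, embedder, and store as arguments, replacing any one of them requires no change to the others, which is the property the pluggable design is after.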

Maintenance & Community

The README does not detail specific maintenance contributors, community channels (e.g., Discord, Slack), or sponsorship information.

Licensing & Compatibility

Arcana is licensed under the permissive Apache License 2.0, allowing broad compatibility, including commercial use and integration into closed-source applications.

Limitations & Caveats

According to the roadmap, planned features such as additional vector store backends (ChromaDB, TurboPuffer) and asynchronous ingestion (Oban) are not yet implemented. Default PDF parsing relies on a system-level Poppler installation.

Health Check

  • Last Commit: 3 days ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 18
  • Issues (30d): 1
  • Star History: 52 stars in the last 30 days
