arcana by georgeguimaraes

RAG library for Elixir/Phoenix with agentic pipelines

Created 5 months ago
307 stars

Top 87.3% on SourcePulse

Project Summary

Arcana is an embeddable Retrieval-Augmented Generation (RAG) library for Elixir/Phoenix applications. It lets developers add vector search, document retrieval, and AI question answering directly to their projects, and it supports both simple RAG and more sophisticated agentic pipelines.

How It Works

Arcana implements RAG as a configurable pipeline: text is chunked, embedded (locally via Bumblebee, or via OpenAI), and stored in a swappable vector backend (pgvector or HNSWLib). Search runs in semantic, full-text, or hybrid mode; hybrid mode combines the vector and full-text results using Reciprocal Rank Fusion. Agentic RAG orchestrates complex Q&A through retrieval gating, query expansion, decomposition, multi-hop reasoning, and re-ranking. Optional GraphRAG builds knowledge graphs, with entity extraction, community detection, and fusion search.
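To make the hybrid-search step concrete, here is a minimal Python sketch of Reciprocal Rank Fusion in its common textbook form. It is not Arcana's code or API: the function name, the constant k=60, and the document ids are assumptions chosen for illustration.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document ids into a single ranking.

    Each document scores 1 / (k + rank) for every list it appears in,
    where rank starts at 1. k=60 is a commonly used default and is an
    assumption here, not a value taken from Arcana.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by id for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical results from a vector search and a full-text search.
semantic = ["doc_a", "doc_b", "doc_c"]
fulltext = ["doc_b", "doc_d", "doc_a"]

fused = reciprocal_rank_fusion([semantic, fulltext])
print(fused)
```

Documents that rank well in both lists (doc_a and doc_b here) rise above documents that appear in only one list, and only ranks are used, so the raw scores of the two search modes never need to be on the same scale.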

Quick Start & Requirements

Install with mix igniter.install arcana, or add the dependency manually and then run mix arcana.install and mix ecto.migrate. A PostgreSQL database with pgvector is required. Local embeddings require an Nx backend (EXLA, EMLX, or Torchx) and its dependency. Default PDF ingestion requires system-installed Poppler utilities. Official guides are available.

Highlighted Details

  • Agentic Pipelines: Advanced RAG orchestration with retrieval gating, query expansion, decomposition, multi-hop reasoning, and re-ranking.
  • GraphRAG: Knowledge graph integration with entity extraction, community detection, and fusion search.
  • Hybrid Search: Combines vector and full-text search using Reciprocal Rank Fusion.
  • Pluggable Architecture: All components (chunkers, embedders, agent steps) are swappable.
  • LiveView Dashboard: Optional UI for document management and search.
  • Grounding: Hallucination detection via Hallmark NLI model.
  • Flexible Backends: Supports pgvector/HNSWLib for vectors, local/OpenAI for embeddings.
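The pluggable design described above can be pictured with a small Python sketch in which the chunker and the embedder are plain functions passed into an ingestion step. Everything here is hypothetical and for illustration only: the names fixed_size_chunker, letter_count_embedder, and ingest are not part of Arcana, and the toy embedder stands in for a real model such as one served by Bumblebee or OpenAI.

```python
from typing import Callable, List, Tuple

# Hypothetical interfaces: any function with these shapes can be swapped in.
Chunker = Callable[[str], List[str]]
Embedder = Callable[[str], List[float]]


def fixed_size_chunker(size: int, overlap: int = 0) -> Chunker:
    """Build a chunker that cuts text into windows of `size` characters,
    each window starting `size - overlap` characters after the previous one."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    step = size - overlap

    def chunk(text: str) -> List[str]:
        return [text[i:i + size] for i in range(0, len(text), step)]

    return chunk


def letter_count_embedder(text: str) -> List[float]:
    """Toy embedder: a 26-dimensional vector of letter counts (a to z)."""
    vector = [0.0] * 26
    for ch in text.lower():
        if "a" <= ch <= "z":
            vector[ord(ch) - ord("a")] += 1.0
    return vector


def ingest(text: str, chunker: Chunker, embedder: Embedder) -> List[Tuple[str, List[float]]]:
    """Chunk the text, embed each chunk, and return (chunk, vector) pairs."""
    return [(piece, embedder(piece)) for piece in chunker(text)]


records = ingest("abcdefghij", fixed_size_chunker(4, overlap=2), letter_count_embedder)
for piece, vector in records:
    print(piece, sum(vector))
```

Because ingest depends only on the two function shapes, replacing the chunking strategy or the embedding model means passing a different function, which is the kind of swap the Pluggable Architecture item refers to.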

Maintenance & Community

The README does not name specific maintainers or contributors, community channels (e.g., Discord, Slack), or sponsorship information.

Licensing & Compatibility

Arcana is licensed under the permissive Apache License 2.0, allowing broad compatibility, including commercial use and integration into closed-source applications.

Limitations & Caveats

According to the roadmap, additional vector store backends (ChromaDB, TurboPuffer) and asynchronous ingestion (Oban) are planned but not yet implemented. Default PDF parsing relies on a system-level Poppler installation.

Health Check

  • Last Commit: 1 month ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 9 stars in the last 30 days
