fojin by xr843

Encyclopedic Buddhist digital text platform

Created 2 months ago
302 stars

Top 88.0% on SourcePulse

Project Summary

Summary

FoJin is a Buddhist digital text platform that addresses the fragmentation of Buddhist heritage across global databases. It gives researchers and practitioners a unified interface to over 23,500 full-text volumes from 503 sources in 30 languages, with AI-powered Q&A, a knowledge graph, and advanced search. By consolidating these sources behind one interface, the platform aims to shorten the time spent locating texts.

How It Works

FoJin aggregates Buddhist digital texts from 503 sources. Semantic search runs on PostgreSQL with pgvector (HNSW index), keyword search runs on Elasticsearch, and the backend is built with FastAPI. Key features include an AI Q&A assistant ("XiaoJin") that uses retrieval-augmented generation (RAG) with multiple LLM providers and tradition-scoped "Master Personas," a knowledge graph visualizing 31K+ entities and 28K+ relations, and a geo map of 50K+ entities. Data is imported via Python scripts; text content is fetched from the original sources rather than bundled with the repository.
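The summary does not say how results from the keyword index and the semantic index are combined. As an illustration only, one common way to merge two ranked lists is reciprocal rank fusion (RRF). The sketch below assumes each backend returns an ordered list of document IDs; the function name, the IDs, and the constant k=60 are hypothetical choices and are not taken from FoJin.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked lists of document IDs into one ranking.

    Each document scores 1 / (k + rank) for every list it appears in,
    where rank starts at 1. This is a generic technique, not FoJin's
    documented method.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID so output is deterministic.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical result lists standing in for Elasticsearch and pgvector output.
keyword_hits = ["text-1", "text-2", "text-3"]
semantic_hits = ["text-2", "text-4", "text-1"]

fused = reciprocal_rank_fusion([keyword_hits, semantic_hits])
print(fused)
```

Documents found by both backends ("text-1" and "text-2" here) rise above those found by only one, which is the usual motivation for pairing keyword and vector search.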

Quick Start & Requirements

Installation uses Docker Compose (docker compose up -d). Prerequisites include Docker and a configured .env file. Initial setup requires importing text content via provided Python scripts (e.g., import_content.py). Dependencies include PostgreSQL, Elasticsearch, and Redis.
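The startup step above can be wrapped in a small preflight check. This is a minimal sketch that assumes the .env file sits in the project root next to the Compose file; the helper names (compose_command, preflight, start) are hypothetical and not part of FoJin. Only the command docker compose up -d comes from the summary.

```python
import shutil
import subprocess
from pathlib import Path


def compose_command(detached=True):
    """Build the Docker Compose invocation named in the quick start."""
    cmd = ["docker", "compose", "up"]
    if detached:
        cmd.append("-d")
    return cmd


def preflight(project_dir):
    """Return a list of problems that would block startup."""
    problems = []
    if shutil.which("docker") is None:
        problems.append("docker not found on PATH")
    # Assumption: the .env file lives in the project root.
    if not (Path(project_dir) / ".env").is_file():
        problems.append(".env file missing")
    return problems


def start(project_dir="."):
    """Run the preflight checks, then start the services."""
    problems = preflight(project_dir)
    if problems:
        raise SystemExit("; ".join(problems))
    subprocess.run(compose_command(), cwd=project_dir, check=True)
```

Calling start() from the repository root would launch the containers. The text import scripts (e.g., import_content.py) still have to be run afterwards, and their arguments are not described in this summary.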

Highlighted Details

  • Aggregates 23,500+ full-text volumes from 503 sources across 30 languages.
  • AI Q&A ("XiaoJin") with 8 historical Buddhist master personas, RAG, and clickable citations.
  • Knowledge graph with 31K+ entities and 28K+ relations, plus an interactive geo map of 50K+ entities.
  • Supports parallel reading in 30 languages and offers 32 dictionaries with 748K+ entries.

Maintenance & Community

Active maintenance is indicated by CI/CD badges. Community engagement is facilitated via GitHub Discussions and Discord. Bug reporting and contributions are managed through GitHub Issues and a CONTRIBUTING.md file.

Licensing & Compatibility

The FoJin source code is Apache 2.0 licensed. However, integrated third-party data sources retain their own licenses (e.g., CC BY-NC-SA, CC0). Users must consult the NOTICE file for specific data license details. Commercial use compatibility depends on these varied data licenses.

Limitations & Caveats

Initial setup requires a separate, potentially time-consuming data import for text content. Cross-lingual search is planned but not yet available. Commercial usability is contingent on the licenses of the numerous third-party data sources.

Health Check

Last Commit: 2 days ago
Responsiveness: Inactive
Pull Requests (30d): 129
Issues (30d): 0
Star History: 18 stars in the last 30 days
