Anagnorisis by volotat

Personalized local data management and recommendation engine

Created 2 years ago
301 stars

Top 88.5% on SourcePulse

Project Summary

Summary

Anagnorisis is a local-first data management platform with a private, trainable recommendation engine. It addresses data privacy concerns by processing all user data locally, while still enabling personalized information filtering and recommendations. The system aims to create a user's "digital twin" for efficient navigation of online content.

How It Works

Users rate their local data (text, audio, images, video), and the recommendation models fine-tune on these ratings. From v0.3.4 onward, the project has been shifting to a unified omni-descriptor model (e.g., MiniCPM-o-4_5) that converts every modality into a text description. Search and recommendations then operate on embeddings of those descriptions, which gives consistent, scalable cross-modal behavior and simplifies future development toward a single preference model.
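To make the flow above concrete, here is a minimal Python sketch of searching across modalities once every file has a text description. It is not Anagnorisis code: the file names and descriptions are made up, and embed is a toy bag-of-words stand-in for a real text-embedding model, chosen only so the example runs without dependencies.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy stand-in for a text embedding: lowercase bag-of-words counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Hypothetical descriptions, of the kind a descriptor model might emit
# for an audio file, an image, a video, and a text file.
descriptions = {
    "song.mp3": "slow acoustic guitar ballad with soft female vocals",
    "photo.jpg": "sunset over a mountain lake with pine trees",
    "clip.mp4": "street musician playing acoustic guitar in a city square",
    "notes.txt": "meeting notes about quarterly budget planning",
}


def search(query, descriptions, top_k=2):
    """Rank files by similarity between the query and their descriptions."""
    q = embed(query)
    scored = [(cosine(q, embed(d)), name) for name, d in descriptions.items()]
    scored.sort(reverse=True)
    return [name for score, name in scored[:top_k] if score > 0]


results = search("acoustic guitar", descriptions)
print(results)
```

Because every file is reduced to text first, the same search function serves audio, images, video, and documents. That is the simplification the unified approach is after.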

Quick Start & Requirements

Docker is the preferred installation method. Clone the repo, set absolute paths for the data and config directories in docker-compose.override.yaml, and run docker compose up -d. Prerequisites: Docker, plus the NVIDIA Container Toolkit or WSL2 for GPU access. Recommended hardware: NVIDIA GPU with 8GB VRAM, 32GB RAM. Container storage: ~45GB.
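Before starting the containers, it can help to sanity-check the host. The sketch below is a hypothetical helper, not part of the project. It checks that the directories you plan to reference in docker-compose.override.yaml are absolute and exist, and that a given disk has roughly the 45GB mentioned above. The function names, the labels, and the choice of which disk to inspect are all assumptions.

```python
import os
import shutil

REQUIRED_FREE_GB = 45  # approximate container storage quoted in the summary


def check_paths(paths):
    """Return a list of problems for a {label: path} mapping (empty if OK)."""
    problems = []
    for label, path in paths.items():
        if not os.path.isabs(path):
            problems.append(f"{label}: '{path}' is not an absolute path")
        elif not os.path.isdir(path):
            problems.append(f"{label}: '{path}' is not an existing directory")
    return problems


def check_free_space(path, required_gb=REQUIRED_FREE_GB):
    """Return True if the disk holding `path` has at least required_gb free."""
    free_gb = shutil.disk_usage(path).free / 1024**3
    return free_gb >= required_gb


def docker_available():
    """Return True if a `docker` executable is found on PATH."""
    return shutil.which("docker") is not None
```

A typical use would be to call check_paths with your intended data and config directories, then check_free_space on the disk where Docker stores its images, and fix anything reported before running docker compose up -d.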

Highlighted Details

  • Local-First Data Privacy: All data processing and storage are strictly local.
  • Trainable Recommendation Engine: Models fine-tune on user ratings for personalized outputs.
  • Unified Omni-Descriptor Model: Recent focus on a single model for cross-modal text embedding, unifying search and recommendations.
  • Multi-Instance Support: Enables running multiple independent Anagnorisis instances.
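As a rough illustration of how ratings can drive recommendations, the Python sketch below learns an average rating per word from a few rated descriptions and uses it to score unrated items. This is a deliberately simple stand-in, not the project's fine-tuning method. The 0 to 10 rating scale, the sample descriptions, and the fit and predict helpers are all hypothetical.

```python
import re
from collections import defaultdict


def tokens(text):
    """Split a description into a set of lowercase word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def fit(rated):
    """Map each token to the mean rating of the descriptions containing it."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for text, rating in rated:
        for t in tokens(text):
            totals[t] += rating
            counts[t] += 1
    return {t: totals[t] / counts[t] for t in totals}


def predict(model, text, default):
    """Average the learned scores of known tokens; fall back to default."""
    known = [model[t] for t in tokens(text) if t in model]
    return sum(known) / len(known) if known else default


# Hypothetical user ratings on an assumed 0-10 scale.
rated = [
    ("acoustic guitar ballad", 9.0),
    ("acoustic folk song", 8.0),
    ("loud industrial noise", 2.0),
]

model = fit(rated)
default = sum(r for _, r in rated) / len(rated)  # mean rating as a fallback

score_liked = predict(model, "quiet acoustic guitar", default)
score_disliked = predict(model, "industrial noise loop", default)
print(score_liked, score_disliked)
```

Sorting unrated items by such a predicted score yields a personalized ranking. A real system would replace the per-word averages with a model trained on learned embeddings.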

Maintenance & Community

No details are given on active maintenance, core contributors, or community channels (Discord, Slack). The project appears to be maintained by an individual or a small team, and its roadmap is available on the wiki.

Licensing & Compatibility

The specific open-source license is not stated. The platform defaults to localhost-only access, and it warns against exposing an instance to the internet because its security measures are incomplete.

Limitations & Caveats

The ongoing architectural shift to the omni-descriptor model may introduce performance regressions (slower processing, higher resource demands) and initial accuracy issues. The storage (~45GB) and hardware (recommended 8GB VRAM GPU, 32GB RAM) requirements are significant. Security for external access is incomplete. Initial setup involves lengthy model downloads and data caching. The license remains unspecified.

Health Check

Last Commit: 2 days ago
Responsiveness: Inactive
Pull Requests (30d): 0
Issues (30d): 0
Star History: 3 stars in the last 30 days
