second-brain-agent  by flepied

AI agent for intelligent personal knowledge management

Created 2 years ago
269 stars

Top 95.6% on SourcePulse

Summary

This project provides an AI-powered "Second Brain" agent that automates indexing, searching, and interacting with personal knowledge bases, which consist primarily of markdown files and the content they link to. Inspired by Tiago Forte's Second Brain concept, it aims to make professionals, students, and researchers more productive by letting them query large volumes of personal data in natural language.

How It Works

The system automatically indexes markdown files and extracts text from linked PDFs, YouTube videos, and web pages. Content is chunked, converted into vector embeddings, and stored in a ChromaDB vector store. Leveraging Retrieval-Augmented Generation (RAG), the agent analyzes user questions to determine intent, routing queries through specialized chains for summarization, content lookup, activity reports (using date metadata), or general Q&A, all powered by LangChain and an OpenAI LLM.
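
The routing step described above can be illustrated with a minimal sketch. This is not the project's implementation: the source says an LLM determines intent, whereas the sketch below swaps in simple keyword patterns so it runs without an API key. The intent labels, the patterns, and the names classify_intent and route are all hypothetical.

```python
import re
from typing import Callable, Dict

# Hypothetical intent labels mirroring the four routes named in the text:
# summarization, content lookup, activity reports, and general Q&A.
# The keyword patterns are an assumption standing in for the LLM-based
# intent analysis that the project is described as using.
INTENT_PATTERNS: Dict[str, "re.Pattern[str]"] = {
    "summary": re.compile(r"\b(summarize|summary|overview)\b", re.IGNORECASE),
    "activity": re.compile(r"\b(last week|yesterday|today|this month)\b", re.IGNORECASE),
    "lookup": re.compile(r"\b(find|where is|which (note|file|document))\b", re.IGNORECASE),
}


def classify_intent(question: str) -> str:
    """Return the first intent whose pattern matches, else 'general'."""
    for intent, pattern in INTENT_PATTERNS.items():
        if pattern.search(question):
            return intent
    return "general"


def route(question: str, handlers: Dict[str, Callable[[str], str]]) -> str:
    """Dispatch the question to the handler registered for its intent."""
    return handlers[classify_intent(question)](question)


if __name__ == "__main__":
    # Placeholder handlers; a real chain would query the vector store.
    demo_handlers = {
        name: (lambda q, name=name: f"[{name}] {q}")
        for name in ("summary", "activity", "lookup", "general")
    }
    print(route("Summarize my notes on LangChain", demo_handlers))
```

In a real pipeline each handler would be a separate chain with its own prompt and retrieval strategy; keeping classification apart from dispatch makes it possible to replace the keyword matcher with an LLM call without touching the handlers.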

Quick Start & Requirements

  • Prerequisites: Python 3, poetry, inotify-tools. Tested on Fedora Linux 42 and Ubuntu.
  • Installation: Clone the repository, copy example.env to .env and configure it, then run poetry install. Because of a poetry/torch/PyPI bug, also run poetry run pip install torch as a workaround.
  • Run Web UI: streamlit run second_brain_agent.py
  • Systemd Services: Run ./install-systemd-services.sh to install systemd services that manage the scripts automatically.
  • Docs/Demo: Web UI provides interactive access.

Highlighted Details

  • Automated indexing of markdown, PDFs, YouTube videos, and web pages.
  • Intent-based query processing for summaries, specific lookups, activity reports, and general questions.
  • MCP (Model Context Protocol) Server offers programmatic API access to the vector database and document retrieval system for external integrations.
  • Domain classification for documents (e.g., "Workout" from "WorkoutHistory202412.md") and support for date-based history extraction.
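
The domain classification example above ("Workout" from "WorkoutHistory202412.md") suggests that both the domain and a date can be read from the file name. The sketch below shows one way this could work, assuming a naming convention of the form <Domain>History<YYYY><MM>.md. That convention is inferred from the single example in the text, and the names HistoryFile and parse_history_filename are hypothetical; the project's actual rules may differ.

```python
import re
from typing import NamedTuple, Optional

# Assumed convention, inferred from the one example in the text:
# <Domain>History<YYYY><MM>.md, e.g. "WorkoutHistory202412.md".
HISTORY_NAME = re.compile(
    r"(?P<domain>[A-Za-z]+?)History(?P<year>\d{4})(?P<month>\d{2})\.md"
)


class HistoryFile(NamedTuple):
    domain: str
    year: int
    month: int


def parse_history_filename(name: str) -> Optional[HistoryFile]:
    """Extract the domain and year/month from a history file name.

    Returns None when the name does not follow the assumed convention
    or when the month is outside 1..12.
    """
    match = HISTORY_NAME.fullmatch(name)
    if match is None:
        return None
    month = int(match["month"])
    if not 1 <= month <= 12:
        return None
    return HistoryFile(match["domain"], int(match["year"]), month)


if __name__ == "__main__":
    print(parse_history_filename("WorkoutHistory202412.md"))
```

Metadata extracted this way could be attached to each chunk before it is stored, which is what would let an activity report filter documents by date.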

Maintenance & Community

The README does not provide specific details regarding notable contributors, sponsorships, or community channels like Discord or Slack.

Licensing & Compatibility

The project's license is not explicitly stated in the README, which may pose compatibility concerns for commercial or closed-source use.

Limitations & Caveats

The system has been tested mainly on Fedora and Ubuntu, so it may behave differently on other operating systems. Installation requires a specific workaround for torch. Running the integration tests requires Docker or Podman and a running vector database.

Health Check

  • Last Commit: 2 months ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 6 stars in the last 30 days
