Jennyee1
Multimodal academic agent for scholarly research and knowledge synthesis
Top 82.0% on SourcePulse
ScholarMind is a multimodal academic research agent designed for the LLM Agent domain. It assists users with tasks such as paper retrieval, PDF and chart comprehension, knowledge graph construction, learning path planning, and code reproduction. The agent integrates with MCP hosts, offering a streamlined workflow for researchers and developers seeking to manage and leverage academic knowledge efficiently.
How It Works
ScholarMind runs as a plugin inside MCP (Model Context Protocol) hosts and is built from modular components. Paper search draws on two sources, Semantic Scholar and arXiv. The PDF parser uses PyMuPDF in generator mode, yielding content incrementally instead of loading a whole document at once, to avoid out-of-memory errors. Core functionality includes multimodal understanding of figures and formulas; automatic knowledge graph construction, with output constrained by Pydantic schemas and stored in a NetworkX graph that carries a time dimension; and learning path planning informed by PageRank-based knowledge gap analysis. Code reproduction runs in a secure subprocess sandbox.
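The PageRank-based gap analysis over a time-stamped graph can be sketched roughly as follows. This is a minimal illustration and not ScholarMind's implementation: the Edge record, the "source builds on target" edge direction, the knowledge_gaps helper, and the sample concepts are all assumptions made for the example. To stay dependency-free it swaps in a dataclass and a hand-written power iteration where the project itself reportedly uses Pydantic schemas and NetworkX.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Edge:
    """Hypothetical record: `source` builds on `target`, first seen in `year`."""
    source: str
    target: str
    year: int


def pagerank(edges, damping=0.85, iterations=50):
    """Plain power-iteration PageRank over a list of Edge objects."""
    nodes = sorted({e.source for e in edges} | {e.target for e in edges})
    out_links = {n: [] for n in nodes}
    for e in edges:
        out_links[e.source].append(e.target)
    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        # Rank held by nodes without outgoing edges is spread evenly.
        dangling = sum(rank[n] for n in nodes if not out_links[n])
        new_rank = {
            n: (1 - damping) / len(nodes) + damping * dangling / len(nodes)
            for n in nodes
        }
        for n in nodes:
            for target in out_links[n]:
                new_rank[target] += damping * rank[n] / len(out_links[n])
        rank = new_rank
    return rank


def knowledge_gaps(edges, known, as_of_year, top_k=3):
    """Highest-ranked concepts the learner does not know yet (assumed definition)."""
    # The time dimension is used here only to hide edges newer than as_of_year.
    visible = [e for e in edges if e.year <= as_of_year]
    rank = pagerank(visible)
    gaps = [n for n in rank if n not in known]
    # Sort by descending rank, breaking ties alphabetically for determinism.
    return sorted(gaps, key=lambda n: (-rank[n], n))[:top_k]


# Illustrative data only; these edges are not taken from the project.
edges = [
    Edge("Transformer", "Attention", 2017),
    Edge("BERT", "Transformer", 2018),
    Edge("GPT", "Transformer", 2018),
    Edge("ViT", "Transformer", 2020),
]
print(knowledge_gaps(edges, known={"BERT"}, as_of_year=2020, top_k=2))
# → ['Attention', 'Transformer']
```

Because edges point from a concept to what it builds on, rank accumulates on prerequisites, so the concepts surfaced first are the foundational ones the learner has not covered. With NetworkX available, the hand-written loop could likely be replaced by networkx.pagerank on a DiGraph.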
Quick Start & Requirements
To install, clone the repository, install the dependencies with pip install -r requirements.txt, and run python install.py, which automates setup and generates the configuration. Users must then set their API keys (e.g., MINIMAX_API_KEY) in a .env file and merge the generated mcp_config.json into their MCP host's configuration (e.g., ~/.gemini/antigravity/mcp_config.json or ~/.claude/mcp_config.json).
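The merge step could be scripted along these lines. This is a sketch under assumptions, not part of the project: it assumes both files keep their server entries under a top-level "mcpServers" object, a convention common among MCP hosts but not confirmed by the README, and the function names, the "scholarmind" entry name, and server.py are hypothetical.

```python
import json
from pathlib import Path


def merge_mcp_config(host_config: dict, generated: dict) -> dict:
    """Return a copy of host_config with the generated server entries added.

    Assumes server entries live under a top-level "mcpServers" key.
    """
    merged = json.loads(json.dumps(host_config))  # deep copy of JSON-safe data
    servers = merged.setdefault("mcpServers", {})
    # Generated entries overwrite host entries that share the same name.
    servers.update(generated.get("mcpServers", {}))
    return merged


def merge_files(host_path: Path, generated_path: Path) -> None:
    """Merge the generated config file into the host's config file in place."""
    host = json.loads(host_path.read_text()) if host_path.exists() else {}
    generated = json.loads(generated_path.read_text())
    merged = merge_mcp_config(host, generated)
    host_path.write_text(json.dumps(merged, indent=2))


# Illustrative values only; the real generated entry may look different.
host = {"mcpServers": {"other": {"command": "x"}}, "theme": "dark"}
generated = {"mcpServers": {"scholarmind": {"command": "python", "args": ["server.py"]}}}
print(sorted(merge_mcp_config(host, generated)["mcpServers"]))
# → ['other', 'scholarmind']
```

Merging rather than overwriting keeps any servers and unrelated settings already present in the host's file. It is worth backing up the host configuration before calling merge_files on it.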
Highlighted Details
Code reproduction runs in a secure subprocess sandbox.
Maintenance & Community
The provided README does not contain specific details regarding notable contributors, sponsorships, partnerships, or community channels (e.g., Discord, Slack).
Licensing & Compatibility
The project is released under the MIT license, which permits broad usage, including commercial applications and integration with closed-source projects.
Limitations & Caveats
The README does not explicitly detail known limitations, alpha status, or specific performance benchmarks. While multimodal understanding includes token control, specific performance trade-offs or unsupported modalities are not elaborated upon.