mad-professor-public  by LYiHub

AI companion for reading papers with a "grumpy professor" persona

created 3 months ago
1,536 stars

Top 27.6% on sourcepulse

GitHubView on GitHub
Project Summary

This project provides an AI-powered desktop application designed to enhance academic paper reading efficiency for researchers. It offers features like PDF processing, AI translation, RAG-based Q&A, and voice interaction, all delivered through a unique "mad professor" persona for a more engaging experience.

How It Works

The application employs a multi-stage pipeline: PDF ingestion and parsing (via magic-pdf), content translation, structuring, and embedding for RAG. A PyQt6 frontend provides a split-pane interface for viewing documents and interacting with the AI. The core AI functionality leverages LLMs for Q&A and includes speech recognition (Whisper) and TTS for voice interaction, with RAG enhancing retrieval accuracy.

Quick Start & Requirements

  • Install: Use conda to create an environment, then pip install dependencies.
  • Prerequisites: Python 3.10+, CUDA support with >6GB VRAM, faiss-gpu (via conda), numpy<=2.1.1.
  • API Keys: Requires API keys for DeepSeek and MiniMax for LLM and TTS services, configurable in config.py.
  • Models: A download_models.py script handles model downloads.
  • Docs: MinerU, RealtimeSTT, DeepSeek, MiniMax.

Highlighted Details

  • Features a "mad professor" persona with customizable prompts and voice.
  • Supports bilingual (Chinese/English) paper viewing with side-by-side display.
  • Integrates RAG for precise retrieval and context-aware AI questioning.
  • Offers voice input and TTS output for interactive Q&A.

Maintenance & Community

The project explicitly thanks the MinerU and RealtimeSTT projects. Customization of persona and voice requires manual code modification.

Licensing & Compatibility

Licensed under the Apache License. This license is permissive and generally compatible with commercial use and closed-source linking.

Limitations & Caveats

The application is primarily designed for structured academic PDFs; unstructured documents may cause errors. Concurrent audio input device switching and AI voice feedback loops (if not using headphones) are noted issues.

Health Check
Last commit

3 months ago

Responsiveness

Inactive

Pull Requests (30d)
0
Issues (30d)
1
Star History
291 stars in the last 90 days

Explore Similar Projects

Feedback? Help us improve.