mad-professor-public by LYiHub

AI companion for reading papers with a "grumpy professor" persona

Created 8 months ago

1,575 stars

Top 26.4% on SourcePulse

Project Summary

This project provides an AI-powered desktop application designed to enhance academic paper reading efficiency for researchers. It offers features like PDF processing, AI translation, RAG-based Q&A, and voice interaction, all delivered through a unique "mad professor" persona for a more engaging experience.

How It Works

The application employs a multi-stage pipeline: PDF ingestion and parsing (via magic-pdf), content translation, structuring, and embedding for RAG. A PyQt6 frontend provides a split-pane interface for viewing documents and interacting with the AI. The core AI functionality leverages LLMs for Q&A and includes speech recognition (Whisper) and TTS for voice interaction, with RAG enhancing retrieval accuracy.

Quick Start & Requirements

Install: Use conda to create an environment, then pip install dependencies.
Prerequisites: Python 3.10+, CUDA support with >6GB VRAM, faiss-gpu (via conda), numpy<=2.1.1.
API Keys: Requires API keys for DeepSeek and MiniMax for LLM and TTS services, configurable in config.py.
Models: A download_models.py script handles model downloads.
Docs: MinerU, RealtimeSTT, DeepSeek, MiniMax.

Highlighted Details

Features a "mad professor" persona with customizable prompts and voice.
Supports bilingual (Chinese/English) paper viewing with side-by-side display.
Integrates RAG for precise retrieval and context-aware AI questioning.
Offers voice input and TTS output for interactive Q&A.

Maintenance & Community

The project explicitly thanks the MinerU and RealtimeSTT projects. Customization of persona and voice requires manual code modification.

Licensing & Compatibility

Licensed under the Apache License. This license is permissive and generally compatible with commercial use and closed-source linking.

Limitations & Caveats

The application is primarily designed for structured academic PDFs; unstructured documents may cause errors. Concurrent audio input device switching and AI voice feedback loops (if not using headphones) are noted issues.

mad-professor-public by LYiHub

Explore Similar Projects

paper_to_podcast by Azzedde

rag-chatbot by datvodinh

language-models by piegu

HunyuanOCR by Tencent-Hunyuan

vits-simple-api by Artrajz

ollama-playground by NarimanN2

deepdoctection by deepdoctection

awesome-bangla by banglakit

pororo by kakaobrain

pdf-to-podcast by knowsuchagency

FunASR by modelscope

docling by docling-project