NBLM2PPTX by laihenyi

AI-powered conversion of NotebookLM PDFs to editable PPTX

Created 2 months ago
291 stars

Top 90.8% on SourcePulse

Project Summary

This project converts NotebookLM exported PDFs into editable PPTX presentations, separating background images from text layers. It addresses the need for repurposing NotebookLM content, enabling users to easily edit and integrate notes into standard presentation workflows.

How It Works

The tool uses Gemini 2.5 Flash for AI-driven text removal and for OCR on image-based PDFs, and uses PDF.js for precise, local text extraction from PDFs that contain native text. Text removal and OCR run in parallel, which, together with this hybrid approach, brings processing down to roughly 2-3 seconds per page. A dual-mode OCR system trades speed against visual accuracy: a faster 'Lite' mode that applies uniform styling, or a 'Standard' mode that preserves full font fidelity.
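The parallel, hybrid flow described above can be sketched roughly as follows. This is a minimal illustration, not the project's code: the project runs in the browser, while this sketch uses Python's asyncio, and the names `Page`, `has_native_text`, `remove_text`, `extract_text`, and `convert_page` are hypothetical stand-ins. The two async functions simulate the real calls with short sleeps.

```python
import asyncio
from dataclasses import dataclass
from typing import Tuple


@dataclass
class Page:
    number: int
    has_native_text: bool  # hypothetical flag: True when the page carries a text layer


async def remove_text(page: Page) -> str:
    """Stand-in for the AI text-removal request; returns a label for the clean background."""
    await asyncio.sleep(0.01)  # simulates network latency
    return f"background-{page.number}"


async def extract_text(page: Page) -> str:
    """Stand-in for text extraction: local parsing for native text, OCR otherwise."""
    if page.has_native_text:
        return f"local-text-{page.number}"  # parsed locally, no network call
    await asyncio.sleep(0.01)  # simulates the OCR request
    return f"ocr-text-{page.number}"


async def convert_page(page: Page) -> Tuple[str, str]:
    # Both tasks start together, so a page takes about as long as the slower task
    # rather than the sum of the two.
    background, text = await asyncio.gather(remove_text(page), extract_text(page))
    return background, text


if __name__ == "__main__":
    print(asyncio.run(convert_page(Page(number=1, has_native_text=True))))
    print(asyncio.run(convert_page(Page(number=2, has_native_text=False))))
```

Running both tasks through `asyncio.gather` is what keeps the per-page time close to the slower of the two requests. A browser implementation would presumably reach the same effect with `Promise.all`.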

Quick Start & Requirements

  • Primary Install/Run: Open the HTML file in a browser (Chrome/Edge recommended).
  • Prerequisites: Google account for a free API Key via Google AI Studio (aistudio.google.com/apikey). No credit card required.
  • Dependencies: Stable internet for API calls.
  • Resource Footprint: Minimal client-side. API usage is subject to the Google Gemini free tier (15 requests per minute, 1,500 requests per day).
  • Links: Google AI Studio (aistudio.google.com/apikey).
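To get a feel for what the free-tier limits mean in practice, the arithmetic can be sketched as below. The limits come from the list above. The number of API requests per page (`calls_per_page`) is an assumption, since the summary does not state it, and the function names are illustrative only.

```python
FREE_TIER_RPM = 15    # requests per minute, as quoted in the list above
FREE_TIER_RPD = 1500  # requests per day, as quoted in the list above


def min_seconds_between_requests(rpm: int = FREE_TIER_RPM) -> float:
    """Average spacing between requests that stays within the per-minute limit."""
    if rpm < 1:
        raise ValueError("rpm must be at least 1")
    return 60.0 / rpm


def max_pages_per_day(calls_per_page: int, rpd: int = FREE_TIER_RPD) -> int:
    """Upper bound on pages per day, given an assumed number of requests per page."""
    if calls_per_page < 1:
        raise ValueError("calls_per_page must be at least 1")
    return rpd // calls_per_page


if __name__ == "__main__":
    print(min_seconds_between_requests())       # 4.0
    print(max_pages_per_day(calls_per_page=2))  # 750
```

Assuming, for illustration, two requests per page (one for text removal, one for OCR), the daily limit would cap conversion at 750 pages, and staying under the per-minute limit would mean averaging one request every 4 seconds.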

Highlighted Details

  • Dual-Mode OCR: 'Lite' (faster, 50% API quota saving, uniform styling) vs. 'Standard' (full font style detection) using Gemini 2.5 Flash.
  • Parallel Processing: Text removal and OCR run concurrently, reducing page processing to ~2-3 seconds.
  • Hybrid Text Extraction: PDF.js for PDFs with native text, Gemini OCR for image-based pages, ensuring precise text positioning.
  • Layered PPTX Output: Slides contain clean background images and separate, editable text boxes for easy content modification.
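Placing extracted text as separate boxes over a background image requires mapping PDF coordinates to slide coordinates. The sketch below shows one way that mapping could work, not the project's actual method. It relies on two general facts: default PDF user space measures in points (1/72 inch) from the bottom-left corner, while slide layouts are measured from the top-left. The `TextItem` and `TextBox` types and the `to_text_box` function are hypothetical.

```python
from dataclasses import dataclass

POINTS_PER_INCH = 72.0  # default PDF user space unit is 1/72 inch


@dataclass
class TextItem:
    text: str
    x: float       # left edge, in points from the page's left edge
    y: float       # bottom edge, in points from the page's bottom edge
    width: float   # in points
    height: float  # in points


@dataclass
class TextBox:
    text: str
    left_in: float    # inches from the slide's left edge
    top_in: float     # inches from the slide's top edge
    width_in: float
    height_in: float


def to_text_box(item: TextItem, page_height_pt: float) -> TextBox:
    """Flip the vertical axis and convert points to inches."""
    top_pt = page_height_pt - (item.y + item.height)
    return TextBox(
        text=item.text,
        left_in=item.x / POINTS_PER_INCH,
        top_in=top_pt / POINTS_PER_INCH,
        width_in=item.width / POINTS_PER_INCH,
        height_in=item.height / POINTS_PER_INCH,
    )


if __name__ == "__main__":
    title = TextItem(text="Title", x=72, y=468, width=144, height=36)
    print(to_text_box(title, page_height_pt=540))
```

For a 540-point-tall page, an item whose bottom edge sits at y = 468 with a height of 36 points ends up 0.5 inches from the top of the slide. Each resulting box can then be added as an editable text box above the cleaned background image.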

Maintenance & Community

Last updated to v2.3 on January 21, 2026, with an i18n overhaul and design updates. No community channels (Discord/Slack) are mentioned.

Licensing & Compatibility

  • License Type: MIT License.
  • Compatibility: Permissive MIT license allows commercial use and integration into closed-source projects.

Limitations & Caveats

Requires internet for Gemini API calls. 'Lite' OCR mode sacrifices font style fidelity. Text removal may occasionally be incomplete. Optimized for Chrome/Edge.

Health Check

  • Last Commit: 2 months ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 64 stars in the last 30 days
