kindle-ai-export by transitive-bullshit

Kindle book export and AI audiobook generation

Created 1 year ago

293 stars

Top 90.0% on SourcePulse

1 Expert Loves This Project

Travis Fischer

Founder of Agentic

Project Summary

Kindle AI Export extracts the content of Kindle books into several formats, including text, PDF, EPUB, and custom AI-narrated audiobooks. It targets Kindle users who want to reuse books they own for personal AI experiments, accessibility, or alternative ways of reading and listening. It works around Digital Rights Management (DRM) restrictions and lets users create personalized audio content.

How It Works

The project uses Playwright to automate the Kindle web reader. It logs into the user's account, pages through the selected book, and captures a screenshot of each page. A vision-capable large language model (LLM), gpt-4.1-mini by default, then performs Optical Character Recognition (OCR) on these images to extract the text. Once extracted and compiled, the text can be converted into the supported file formats. For audiobook generation, the project uses a Text-to-Speech (TTS) engine, either OpenAI TTS or Unreal Speech TTS.
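
The capture-then-OCR loop described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the project's actual code: `PageSource`, `OcrFn`, `extractBook`, and `compileText` are hypothetical names, and the browser automation and the LLM call are injected as plain functions so that only the control flow is shown.

```typescript
// Hypothetical types: in the real tool, Playwright drives the Kindle web
// reader and a vision-capable LLM performs the OCR. Both are injected here.
interface PageSource {
  // Returns a screenshot of the current page, or null when the book ends.
  capture(): Promise<Uint8Array | null>;
  // Advances the reader to the next page.
  next(): Promise<void>;
}

type OcrFn = (screenshot: Uint8Array, pageIndex: number) => Promise<string>;

interface PageText {
  index: number;
  text: string;
}

async function extractBook(
  source: PageSource,
  ocr: OcrFn,
  maxPages = 10_000, // assumed safety cap so a stuck reader cannot loop forever
): Promise<PageText[]> {
  const pages: PageText[] = [];
  for (let index = 0; index < maxPages; index++) {
    const screenshot = await source.capture();
    if (screenshot === null) break; // no more pages
    const text = (await ocr(screenshot, index)).trim();
    pages.push({ index, text });
    await source.next();
  }
  return pages;
}

// Joins page texts into one document, skipping pages with no recognized text.
function compileText(pages: PageText[]): string {
  return pages
    .map((page) => page.text)
    .filter((text) => text.length > 0)
    .join("\n\n");
}
```

Keeping the page index alongside each page's text is one way to preserve a mapping back to positions in the book, which a later export or audio step could use.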

Quick Start & Requirements

Install and run: Requires Node.js (>=18) and pnpm. Clone the repository, run pnpm install, and create a .env file that sets AMAZON_EMAIL, AMAZON_PASSWORD, ASIN, and OPENAI_API_KEY. The export is then run with commands such as npx tsx src/extract-kindle-book.ts.
Non-default prerequisites: An Amazon Kindle account with owned books, an OpenAI API key, and ffmpeg installed locally for audiobook concatenation.
Links: No direct quick-start or demo links are provided, but usage commands are detailed in the README.
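
Because the run depends on four environment variables, a script could check them up front and fail with one clear message. The sketch below is illustrative only: the variable names come from the setup step above, while `loadConfig`, `ExportConfig`, and the error wording are assumptions and not part of the project.

```typescript
// Variable names are taken from the setup instructions; the function and
// type names here are hypothetical.
type RequiredKey = "AMAZON_EMAIL" | "AMAZON_PASSWORD" | "ASIN" | "OPENAI_API_KEY";
type ExportConfig = Record<RequiredKey, string>;

function loadConfig(env: Record<string, string | undefined>): ExportConfig {
  const missing: string[] = [];

  // Reads one variable, treating unset and whitespace-only values as missing.
  const read = (key: RequiredKey): string => {
    const value = env[key]?.trim() ?? "";
    if (value === "") missing.push(key);
    return value;
  };

  const config: ExportConfig = {
    AMAZON_EMAIL: read("AMAZON_EMAIL"),
    AMAZON_PASSWORD: read("AMAZON_PASSWORD"),
    ASIN: read("ASIN"),
    OPENAI_API_KEY: read("OPENAI_API_KEY"),
  };

  // Report every missing variable at once rather than one per run.
  if (missing.length > 0) {
    throw new Error(`Missing required environment variables: ${missing.join(", ")}`);
  }
  return config;
}
```

In a Node.js script this would typically be called as `loadConfig(process.env)` once the .env file has been loaded.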

Highlighted Details

Supports export to text, PDF, EPUB, Markdown, and AI-narrated audiobooks.
Bypasses Kindle DRM and API limitations by using browser automation and OCR.
Offers choice between OpenAI TTS (higher quality, higher cost) and Unreal Speech TTS (medium quality, lower cost) for audiobook generation.
Aims to preserve Kindle's original sync positions for seamless transitions between reading and listening.
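
For the audiobook step, long text generally has to be split into pieces a TTS service will accept, and the resulting audio files then have to be joined with ffmpeg. The sketch below shows one possible approach and is not taken from the project: `chunkForTts` and `buildConcatList` are hypothetical helpers, `maxChars` is an assumed per-request limit that should be checked against the chosen provider's documentation, and the list format is the one read by ffmpeg's concat demuxer.

```typescript
// Splits text into chunks of at most maxChars characters, preferring
// sentence boundaries. maxChars is an assumed provider limit.
function chunkForTts(text: string, maxChars: number): string[] {
  if (maxChars <= 0) throw new RangeError("maxChars must be positive");
  // Sentence-like units: text ending in ., !, or ? plus trailing whitespace,
  // or a final run of text with no closing punctuation.
  const sentences = text.match(/[^.!?]*[.!?]+\s*|[^.!?]+$/g) ?? [];
  const chunks: string[] = [];
  const push = (piece: string): void => {
    const trimmed = piece.trim();
    if (trimmed.length > 0) chunks.push(trimmed);
  };

  let current = "";
  for (const sentence of sentences) {
    // Start a new chunk when adding this sentence would exceed the limit.
    if (current.length > 0 && current.length + sentence.length > maxChars) {
      push(current);
      current = "";
    }
    // Hard-split any single sentence that is itself longer than the limit.
    let rest = sentence;
    while (rest.length > maxChars) {
      push(rest.slice(0, maxChars));
      rest = rest.slice(maxChars);
    }
    current += rest;
  }
  push(current);
  return chunks;
}

// Builds the text of a list file for ffmpeg's concat demuxer, e.g. for
//   ffmpeg -f concat -safe 0 -i list.txt -c copy audiobook.mp3
// Single quotes in file names are escaped as '\'' inside the quoted path.
function buildConcatList(audioFiles: string[]): string {
  return audioFiles
    .map((name) => `file '${name.replace(/'/g, "'\\''")}'\n`)
    .join("");
}
```

Splitting on sentence boundaries avoids cutting a word or phrase in half between two audio segments. The hard split is only a fallback for unusually long sentences.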

Maintenance & Community

The README does not mention specific contributors, community channels (like Discord/Slack), or a roadmap.

Licensing & Compatibility

The project is released under the MIT License. While the license permits commercial use, the author requests that exported content not be shared publicly, out of respect for author and artist compensation.

Limitations & Caveats

Transcription is occasionally inaccurate, particularly in whitespace and paragraph breaks. Using an LLM for OCR incurs a cost per book, though this is expected to decrease as local LLMs improve. Images embedded in Kindle books are not supported. The process may fail on virtual machines if page content requires WebGL rendering.

Health Check

Last Commit

6 months ago

Responsiveness

Inactive

Star History

12 stars in the last 30 days