filewizard  by LoredCast

Web UI for versatile file processing, OCR, and speech-to-text

Created 1 month ago
556 stars

Top 57.6% on SourcePulse

Project Summary

File Wizard is a self-hosted, browser-based utility for file conversion, OCR, and audio transcription. It targets users who need a versatile local processing tool, and it wraps common CLI tools and Python libraries in a single responsive, dark-themed UI. Features include drag-and-drop uploads, background job processing with real-time status, and extensibility through custom CLI tools.

How It Works

The project wraps established CLI utilities (FFmpeg, LibreOffice, Pandoc) together with Tesseract OCR and the faster-whisper Python library. The stack consists of a FastAPI backend, a vanilla JS/CSS frontend, a Huey task queue, and SQLite storage. Its core advantage is extensibility: custom CLI tools can be added through settings.yml. Processing runs on the CPU by default, and a CUDA-enabled Docker image is available for GPU acceleration.
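The idea of wrapping a CLI tool behind a configurable command can be sketched in a few lines of Python. This is a minimal illustration, not File Wizard's actual code: the {input} and {output} placeholder syntax, the function names build_command and run_tool, and the example FFmpeg template are all assumptions made for this sketch, and the real settings.yml schema is documented on the project Wiki.

```python
import shlex
import subprocess
from pathlib import Path


def build_command(template: str, input_path: Path, output_path: Path) -> list:
    """Split a command template into argv tokens, then fill in the file paths."""
    # Hypothetical placeholder names; not taken from File Wizard's settings.yml.
    values = {"input": str(input_path), "output": str(output_path)}
    # Splitting before substitution keeps a path with spaces as one argument.
    return [token.format(**values) for token in shlex.split(template)]


def run_tool(template: str, input_path: Path, output_path: Path,
             timeout: float = 300.0) -> int:
    """Run the tool without a shell and return its exit code."""
    argv = build_command(template, input_path, output_path)
    # Passing a list (no shell=True) avoids shell interpretation of file names.
    completed = subprocess.run(argv, capture_output=True, text=True, timeout=timeout)
    return completed.returncode


if __name__ == "__main__":
    # Example template for an audio conversion; only the argv list is printed.
    print(build_command("ffmpeg -y -i {input} {output}",
                        Path("my clip.wav"), Path("my clip.mp3")))
```

Building the argument list before substituting paths means an uploaded file name is never parsed as shell syntax, which matters for a tool that executes user-configured commands.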

Quick Start & Requirements

Docker is the recommended installation route, using an image such as loredcast/filewizard:0.3-latest or loredcast/filewizard:0.3-cuda deployed via docker-compose.yml. A local build requires cloning the repository and running docker compose up --build, which can be slow because of large dependencies such as TeX. Manual setup involves creating a Python virtual environment, running pip install -r requirements.txt, and then running ./run.sh. Prerequisites are Docker for the container route or Python 3.x for manual setup, plus an NVIDIA GPU with CUDA if the -cuda image is used. Docs are on the Wiki: https://github.com/LoredCast/filewizard/wiki.

Highlighted Details

  • Extensible architecture integrates any CLI tool via settings.yml.
  • Broad format support via LibreOffice, Pandoc, FFmpeg, Calibre, Tesseract OCR, faster-whisper.
  • Includes OCR and audio transcription capabilities.
  • Features background job processing with persistent history and real-time status.
  • Offers both CPU-default and optional CUDA-enabled Docker images.
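Background jobs with persistent history come down to recording each job's state in a durable store that the UI can poll. The sketch below shows that pattern with Python's standard sqlite3 module. The jobs table, its columns, the status values, and the helper names are hypothetical and chosen for illustration; they are not File Wizard's real schema, and the Huey worker that would call set_status is omitted.

```python
import sqlite3
import uuid
from typing import Optional

# Hypothetical schema for illustration only.
SCHEMA = """
CREATE TABLE IF NOT EXISTS jobs (
    id       TEXT PRIMARY KEY,
    filename TEXT NOT NULL,
    status   TEXT NOT NULL
)
"""


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the job store; pass a file path to persist history."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def create_job(conn: sqlite3.Connection, filename: str) -> str:
    """Insert a new job in the 'pending' state and return its id."""
    job_id = uuid.uuid4().hex
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO jobs (id, filename, status) VALUES (?, ?, 'pending')",
            (job_id, filename),
        )
    return job_id


def set_status(conn: sqlite3.Connection, job_id: str, status: str) -> None:
    """Update a job's status, e.g. from a background worker."""
    with conn:
        conn.execute("UPDATE jobs SET status = ? WHERE id = ?", (status, job_id))


def get_status(conn: sqlite3.Connection, job_id: str) -> Optional[str]:
    """Return the current status, or None if the job id is unknown."""
    row = conn.execute("SELECT status FROM jobs WHERE id = ?", (job_id,)).fetchone()
    return row[0] if row else None
```

A status endpoint would call get_status for each poll, while the worker moves a job through states such as "processing" and "completed" with set_status.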

Maintenance & Community

The provided README does not detail specific community channels, notable contributors, sponsorships, or a public roadmap. Information is primarily derived from the repository's README and Wiki.

Licensing & Compatibility

The provided README text does not specify a license. The license should be confirmed before commercial use or integration into closed-source projects.

Limitations & Caveats

The README carries a significant security warning: exposing the app publicly without authentication risks arbitrary code execution, so it is intended for local use or for deployment behind a secure authentication layer. Building from source can be time-consuming because of large dependencies. Conversion fidelity may vary for complex layouts, and format support depends on how the underlying tools were built.

Health Check

  • Last commit: 5 days ago
  • Responsiveness: Inactive
  • Pull requests (30d): 0
  • Issues (30d): 7

Star History

241 stars in the last 30 days
