LoredCast: Web UI for versatile file processing, OCR, and speech-to-text
Top 57.6% on SourcePulse
Summary
File Wizard is a self-hosted, browser-based utility for file conversion, OCR, and audio transcription. It targets users needing a versatile local processing solution, wrapping common CLI tools and Python libraries into a unified, responsive dark UI. Features include drag-and-drop, background job processing with real-time status, and extensibility.
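To make "background job processing with real-time status" concrete, here is a minimal sketch of a job-status store that a frontend could poll. It assumes a SQLite table, since SQLite is the storage layer the project is described as using. The table layout, column names, and helper functions are all hypothetical and are not taken from the project's code.

```python
import sqlite3
import uuid
from typing import Optional


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) a job store. The schema is hypothetical."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS jobs ("
        "id TEXT PRIMARY KEY, "
        "filename TEXT NOT NULL, "
        "status TEXT NOT NULL, "
        "progress INTEGER NOT NULL DEFAULT 0)"
    )
    return conn


def create_job(conn: sqlite3.Connection, filename: str) -> str:
    """Register a new job as 'pending' and return its id."""
    job_id = uuid.uuid4().hex
    conn.execute(
        "INSERT INTO jobs (id, filename, status) VALUES (?, ?, 'pending')",
        (job_id, filename),
    )
    conn.commit()
    return job_id


def update_job(conn: sqlite3.Connection, job_id: str, status: str, progress: int) -> None:
    """Called by the background worker as it makes progress."""
    conn.execute(
        "UPDATE jobs SET status = ?, progress = ? WHERE id = ?",
        (status, progress, job_id),
    )
    conn.commit()


def get_job(conn: sqlite3.Connection, job_id: str) -> Optional[dict]:
    """Called by the status endpoint that the browser polls."""
    row = conn.execute(
        "SELECT status, progress FROM jobs WHERE id = ?", (job_id,)
    ).fetchone()
    return None if row is None else {"status": row[0], "progress": row[1]}
```

In this sketch the worker calls update_job while it runs and the web layer only ever reads through get_job, so a slow conversion never blocks the request that reports its status.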
How It Works
The project wraps established CLI utilities (FFmpeg, LibreOffice, Pandoc) and Python libraries (faster-whisper, Tesseract OCR) within a FastAPI backend, vanilla JS/CSS frontend, Huey task queue, and SQLite storage. Its core advantage is extensibility via settings.yml for custom CLI tools. It defaults to CPU processing but offers a CUDA-enabled Docker image for GPU acceleration.
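The general idea of wrapping a CLI tool behind a configurable command template can be sketched as follows. This is an illustration only: the {input} and {output} placeholder names, the template string, and both helper functions are assumptions, and the actual settings.yml schema may look quite different.

```python
import shlex
import subprocess
from typing import List


def build_command(template: str, input_path: str, output_path: str) -> List[str]:
    """Turn a command template into an argument list.

    The template is split into tokens first and placeholders are filled
    in per token, so a path containing spaces stays a single argument.
    """
    values = {"input": input_path, "output": output_path}
    return [token.format(**values) for token in shlex.split(template)]


def run_tool(template: str, input_path: str, output_path: str, timeout: int = 600) -> int:
    """Run the wrapped tool without a shell and return its exit code."""
    argv = build_command(template, input_path, output_path)
    result = subprocess.run(argv, capture_output=True, text=True, timeout=timeout)
    return result.returncode


# Hypothetical template, as it might be loaded from a config file.
FFMPEG_TEMPLATE = "ffmpeg -i {input} {output}"
```

For example, build_command(FFMPEG_TEMPLATE, "my song.wav", "out.mp3") yields the argument list ffmpeg, -i, my song.wav, out.mp3. Passing a list to subprocess.run rather than a shell string keeps file names from being interpreted by a shell, though it does not make user-defined commands safe to expose publicly.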
Quick Start & Requirements
The recommended installation uses Docker with a prebuilt image such as loredcast/filewizard:0.3-latest or loredcast/filewizard:0.3-cuda, deployed via docker-compose.yml. To build locally, clone the repository and run docker compose up --build; this can be slow because of large dependencies such as TeX. Manual setup involves creating a Python virtual environment, running pip install -r requirements.txt, and starting the app with ./run.sh. Prerequisites are Docker for the container route or Python 3.x for manual setup, plus an NVIDIA GPU with CUDA if you use the -cuda image. Docs are on the Wiki: https://github.com/LoredCast/filewizard/wiki.
Highlighted Details
settings.yml.
Maintenance & Community
The provided README does not detail specific community channels, notable contributors, sponsorships, or a public roadmap. Information is primarily derived from the repository's README and Wiki.
Licensing & Compatibility
The license type is not specified in the provided README text. This omission requires further investigation before commercial use or integration into closed-source projects.
Limitations & Caveats
The project carries a prominent security warning: exposing the app publicly without authentication risks arbitrary code execution. It is intended for local use or for deployment behind a secure authentication layer. Building from source can be time-consuming because of large dependencies. Conversion fidelity may vary for complex layouts, and format support depends on how the underlying tools were built.