edit-mind  by IliasHad

AI video indexing and semantic search desktop app

Created 2 weeks ago

556 stars

Top 57.6% on SourcePulse

Project Summary

Summary

Edit Mind is a cross-platform desktop app that indexes video libraries with local AI, extracting deep metadata (transcriptions, faces, objects, on-screen text). It supports natural-language semantic search and generates AI-assisted rough cuts, turning a video library into a searchable, offline-first database for editors and researchers.

How It Works

A local AI pipeline analyzes videos: audio transcription (Whisper), 2-second scene segmentation, and deep frame analysis (faces, objects, OCR, colors) via Python plugins. Data is timestamped, embedded (Google Text Embedding Models), and stored in ChromaDB. Natural language search queries are parsed by Google Gemini 2.5 Pro into JSON for vector store retrieval. This privacy-first approach keeps data local, using cloud only for Gemini API query interpretation.
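The timestamped 2-second segmentation step can be pictured with a minimal sketch. It is not taken from the Edit Mind codebase: the `Scene` class, the `segment` and `attach_transcript` helpers, and the `(word, time)` transcript format are all assumptions made for illustration.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Scene:
    # Hypothetical record for one fixed-length window of a video.
    start: float
    end: float
    words: list = field(default_factory=list)


def segment(duration: float, scene_len: float = 2.0) -> list:
    """Split a video of `duration` seconds into fixed-length scenes.

    The final scene is shortened so it never runs past the end of the video.
    """
    count = math.ceil(duration / scene_len)
    return [
        Scene(i * scene_len, min((i + 1) * scene_len, duration))
        for i in range(count)
    ]


def attach_transcript(scenes: list, words: list, scene_len: float = 2.0) -> list:
    """Assign each (word, time_in_seconds) pair to the scene containing it.

    Words whose timestamp falls outside the video are ignored.
    """
    for text, t in words:
        idx = int(t // scene_len)
        if 0 <= idx < len(scenes):
            scenes[idx].words.append(text)
    return scenes


scenes = segment(5.0)
attach_transcript(scenes, [("hello", 0.5), ("world", 2.1)])
for s in scenes:
    print(s.start, s.end, s.words)
```

Keying every piece of metadata to a scene's start and end time is what lets a later search return a playable time range rather than just a file name.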

Quick Start & Requirements

  • Prerequisites: Node.js v22+, Python v3.9+. Recommended: multi-core CPU, modern GPU, 8GB+ RAM.
  • Installation: Clone the repo, run npm install, set up a Python environment (pip install -r requirements.txt, chromadb), and start ChromaDB. Configure GEMINI_API_KEY in .env.
  • Running: npm run start.
  • Links: Project repository: https://github.com/iliashad/edit-mind.

Highlighted Details

  • Privacy-First: Video analysis runs locally and no raw video is uploaded; only search queries are sent to the Gemini API for interpretation.
  • Deep Indexing: Extracts transcriptions, faces, objects, on-screen text, dominant colors.
  • Semantic Search: Natural language queries for content discovery.
  • AI-Generated Rough Cuts: Assemble clips from natural language descriptions.
  • Cross-Platform: macOS, Windows, Linux (Electron).
  • Plugin Architecture: Extensible analysis via Python plugins.
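To make the Deep Indexing and Semantic Search items concrete, here is a small sketch of how a structured query might be matched against per-scene metadata. The JSON keys (`objects`, `faces`, `transcript_contains`) and the scene dictionaries are hypothetical, not Edit Mind's actual schema, and the sketch uses plain metadata filtering in place of the embedding-based retrieval the app performs through ChromaDB.

```python
import json


def matches(scene: dict, query: dict) -> bool:
    """Return True if the scene satisfies every constraint in the query.

    A missing key in the query means "no constraint" for that field.
    """
    wanted_objects = set(query.get("objects", []))
    wanted_faces = set(query.get("faces", []))
    phrase = query.get("transcript_contains", "").lower()
    return (
        wanted_objects <= set(scene.get("objects", []))
        and wanted_faces <= set(scene.get("faces", []))
        and phrase in scene.get("transcript", "").lower()
    )


def search(scenes: list, query_json: str) -> list:
    """Parse a JSON query string and return the matching scenes."""
    query = json.loads(query_json)
    return [s for s in scenes if matches(s, query)]


# Hypothetical indexed scenes, one dict per 2-second window.
indexed = [
    {"start": 0.0, "objects": ["dog", "car"], "faces": ["Alice"],
     "transcript": "Look at the dog"},
    {"start": 2.0, "objects": ["car"], "faces": [],
     "transcript": "driving home"},
]

hits = search(indexed, '{"objects": ["dog"], "transcript_contains": "look"}')
print([h["start"] for h in hits])
```

In a real vector store, a filter of this kind would typically narrow the candidates while embedding similarity ranks them.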

Maintenance & Community

Edit Mind is under active development, is not production-ready, and welcomes contributors. The roadmap includes advanced search, export formats, new plugins, and future cloud sync. No specific community channels are listed.

Licensing & Compatibility

MIT License, permissive for commercial use and closed-source integration.

Limitations & Caveats

This project is in early development, with incomplete features and bugs. Packaging for end users still requires significant effort, and analysis is resource-intensive on consumer hardware and needs optimization. Evolving metadata schemas pose a long-term data migration challenge.

Health Check

  • Last Commit: 21 hours ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 2
  • Issues (30d): 0
  • Star History: 560 stars in the last 15 days
