clipify by louisedesadeleer

Video to social clips generator

Created 2 weeks ago

365 stars

Top 77.0% on SourcePulse

Project Summary

Summary

Clipify turns long-form video into short, engaging clips for social media platforms. Aimed at content creators, podcasters, and interviewers, it automates finding key moments, reframing to vertical formats, and adding stylized captions, offering a fast, local alternative to expensive SaaS solutions.

How It Works

Clipify, a Claude Code skill, uses Whisper to transcribe the video, then scans the transcript for "clip-worthy" segments based on linguistic cues and audio peaks. For reframing, it uses an ffmpeg-based face-tracking mechanism that measures motion energy in speaker regions to dynamically crop and pan a 16:9 source to a 9:16 aspect ratio; a split-screen layout is the alternative. Captions are burned in using an "opus-style" format that highlights the active word. All processing runs locally, which removes cloud dependencies and shortens turnaround time.
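The reframing idea described above can be illustrated with a small numpy sketch. This is not Clipify's actual code: the function names, the grayscale frame layout `(T, H, W)`, and the notion of pre-supplied speaker centre columns are all assumptions made for illustration. It computes the width of a full-height 9:16 window and picks the candidate window with the highest frame-to-frame motion energy.

```python
import numpy as np


def crop_width_for_vertical(src_h: int) -> int:
    """Width of a 9:16 window spanning the full source height (forced even)."""
    w = int(round(src_h * 9 / 16))
    return w - (w % 2)


def motion_energy(frames: np.ndarray, x0: int, x1: int) -> float:
    """Mean absolute frame-to-frame difference inside columns [x0, x1)."""
    # Cast to float first so uint8 differences cannot wrap around.
    region = frames[:, :, x0:x1].astype(np.float32)
    return float(np.abs(np.diff(region, axis=0)).mean())


def pick_crop_x(frames: np.ndarray, speaker_centers: list) -> int:
    """Left edge of the 9:16 crop centred on the most active speaker.

    frames: grayscale array of shape (T, H, W) with T >= 2 (hypothetical input).
    speaker_centers: candidate centre columns, one per speaker (hypothetical input).
    """
    _, src_h, src_w = frames.shape
    crop_w = crop_width_for_vertical(src_h)
    half = crop_w // 2
    best_x, best_energy = 0, -1.0
    for cx in speaker_centers:
        # Clamp the window so it stays inside the frame.
        x0 = min(max(cx - half, 0), src_w - crop_w)
        energy = motion_energy(frames, x0, x0 + crop_w)
        if energy > best_energy:
            best_x, best_energy = x0, energy
    return best_x
```

Evaluating `pick_crop_x` once per short window of frames would give a hard-cut pan between speakers. The returned offset could then be passed as the `x` argument of ffmpeg's `crop=w:h:x:y` filter, although how Clipify wires this into ffmpeg is not described in the summary.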

Quick Start & Requirements

Install by cloning the repository into the Claude Code skills directory:

    git clone https://github.com/louisedesadeleer/clipify.git ~/.claude/skills/clipify

Prerequisites:

  • macOS (for optimal VideoToolbox hardware acceleration, though adaptable to other platforms)
  • Claude Code
  • ffmpeg with libx264 (brew install ffmpeg)
  • openai-whisper (pip install openai-whisper)
  • Python 3 with numpy (pip install numpy)

Run the /clipify command within Claude Code, then answer the prompts for video selection, clip choice, aspect ratio, reframing style, and caption appearance. Final clips are saved to <source-video-dir>/clipify_out/.
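Before running the skill, it can help to confirm that the listed prerequisites are present. The following is a minimal sketch and not part of Clipify; the function name `missing_prerequisites` is hypothetical. It only checks that an `ffmpeg` binary is on the PATH and that the `whisper` and `numpy` modules are importable. It does not verify that ffmpeg was built with libx264 or that Claude Code is installed.

```python
import importlib.util
import shutil


def missing_prerequisites() -> list:
    """Return human-readable names of prerequisites that were not found."""
    missing = []
    # ffmpeg must be discoverable on the PATH.
    if shutil.which("ffmpeg") is None:
        missing.append("ffmpeg (brew install ffmpeg)")
    # The openai-whisper package installs a module named "whisper".
    for module, package in (("whisper", "openai-whisper"), ("numpy", "numpy")):
        if importlib.util.find_spec(module) is None:
            missing.append(f"{package} (pip install {package})")
    return missing


if __name__ == "__main__":
    problems = missing_prerequisites()
    if problems:
        print("Missing: " + ", ".join(problems))
    else:
        print("All prerequisites found.")
```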

Highlighted Details

  • Automated identification of engaging segments through transcript analysis.
  • Dynamic 9:16 reframing with speaker-following hard-cut pans or split-screen.
  • "Opus-style" word-by-word caption generation.
  • Fully local execution, ensuring data privacy and zero cloud costs.
  • Rapid clip generation, reportedly about 20 seconds of processing for a 20-second clip on Apple Silicon.
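The word-by-word caption style listed above can be sketched as follows, assuming word-level timings are already available from the transcription step. The `Word` type, the `highlight_cues` function, the group size of three, and the bracket markers are illustrative assumptions, not Clipify's format; a real renderer would substitute its own styling markup for the brackets.

```python
from dataclasses import dataclass


@dataclass
class Word:
    text: str
    start: float  # seconds
    end: float    # seconds


def highlight_cues(words, group_size=3, open_mark="[", close_mark="]"):
    """Return (start, end, text) cues, one per word, with the active word marked."""
    cues = []
    for i in range(0, len(words), group_size):
        group = words[i:i + group_size]
        for j, active in enumerate(group):
            text = " ".join(
                f"{open_mark}{w.text}{close_mark}" if k == j else w.text
                for k, w in enumerate(group)
            )
            # Hold the cue until the next word in the group starts,
            # so the caption does not blink off between words.
            end = group[j + 1].start if j + 1 < len(group) else active.end
            cues.append((active.start, end, text))
    return cues
```

Each cue shows the whole group on screen and marks only the word being spoken, which is the effect the "opus-style" label refers to.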

Maintenance & Community

The README does not list specific contributors, community channels (such as Discord or Slack), sponsorships, or a public roadmap.

Licensing & Compatibility

The project is released under the MIT license, which permits broad use, including commercial applications and integration into closed-source projects. Hardware acceleration is tuned for macOS VideoToolbox; running on Linux or Windows requires modifications.

Limitations & Caveats

The primary hardware acceleration path is macOS-specific, so cross-platform use requires configuration changes. The tool is optimized for talking-head dialogue formats such as interviews and podcasts, and its face tracking relies on motion heuristics rather than a dedicated face detection model.

Health Check

  • Last Commit: 2 weeks ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 1
  • Issues (30d): 1
  • Star History: 367 stars in the last 18 days
