KlicStudio by krillinai

Video tool for translation and dubbing using LLMs

created 7 months ago
8,186 stars

Top 6.4% on sourcepulse

Project Summary

KlicStudio (also referred to as Krillin AI) is an open-source, AI-powered video localization tool designed for content creators and publishers. It automates the video translation and dubbing workflow, from transcription and translation to voice cloning and platform-specific formatting, so that multilingual video content can be produced quickly.

How It Works

The tool uses a modular architecture that integrates separate AI models for each core function. Speech recognition is handled by OpenAI Whisper (cloud API) or FasterWhisper (local execution), with WhisperKit available as an option on macOS. Large Language Models (LLMs) from OpenAI, DeepSeek, Qwen, or self-hosted alternatives handle translation and subtitle segmentation. Dubbing uses CosyVoice or voice cloning. The system also reformats video automatically for landscape and portrait orientations, targeting platforms such as YouTube, TikTok, and Bilibili.
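To make the modular idea concrete, here is a minimal Python sketch of a pipeline whose stages are interchangeable callables. It is an illustration only, not KlicStudio's actual code: the names `Segment`, `localize`, and the `fake_*` stand-ins are hypothetical, and a real setup would put a Whisper-style recognizer, an LLM translator, and a TTS engine behind the same three slots.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Segment:
    """One subtitle segment with timing (hypothetical structure)."""
    start: float
    end: float
    text: str
    translation: str = ""


# Each stage is a plain callable, so a cloud service or a local model
# can be swapped in without touching the pipeline itself.
Transcriber = Callable[[str], List[Segment]]
Translator = Callable[[str, str], str]
Dubber = Callable[[List[Segment]], str]


def localize(video_path: str, target_lang: str,
             transcribe: Transcriber, translate: Translator,
             dub: Dubber) -> Tuple[List[Segment], str]:
    """Run transcription, translation, and dubbing in order."""
    segments = transcribe(video_path)
    for seg in segments:
        seg.translation = translate(seg.text, target_lang)
    audio_path = dub(segments)
    return segments, audio_path


# Stand-in stages so the sketch runs without any model or API key.
def fake_transcribe(path: str) -> List[Segment]:
    return [Segment(0.0, 1.5, "hello"), Segment(1.5, 3.0, "world")]


def fake_translate(text: str, lang: str) -> str:
    return f"[{lang}] {text}"


def fake_dub(segments: List[Segment]) -> str:
    return "dubbed.wav"


if __name__ == "__main__":
    segs, audio = localize("talk.mp4", "fr",
                           fake_transcribe, fake_translate, fake_dub)
    for s in segs:
        print(f"{s.start:.1f}-{s.end:.1f}: {s.translation}")
    print("audio:", audio)
```

Passing the stages as arguments is what lets one backend be replaced by another (for example, a hosted recognizer by a local one) while the pipeline function stays unchanged.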

Quick Start & Requirements

  • Installation: Download the release executable for your OS. Desktop versions can be run directly; non-desktop versions require a config.toml file and are run from a terminal. Docker deployment is also supported.
  • Prerequisites: macOS users may need to manually trust executables. Configuration requires API keys for chosen LLM and speech recognition services (e.g., OpenAI, Alibaba Cloud).
  • Resources: Local Whisper models require disk space. Performance depends on hardware and chosen models.
  • Links: Desktop Version Instructions, Non-Desktop Version Instructions, Docker Deployment, Configuration Help.

Highlighted Details

  • End-to-end workflow: video download, transcription, translation, dubbing, and formatting.
  • Supports local LLM and speech recognition models for cost and privacy control.
  • Automatic video composition for landscape/portrait formats.
  • Extensive language support for input (10+ languages) and translation (101 languages).
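The landscape/portrait item above comes down to aspect-ratio arithmetic. The source does not say how the tool performs the conversion, so the sketch below shows just one common approach, a centered crop, with a hypothetical helper `center_crop_box`. Dimensions are rounded down to even numbers because many video encoders expect that.

```python
def center_crop_box(width: int, height: int,
                    ratio_w: int, ratio_h: int) -> tuple:
    """Return (x, y, crop_w, crop_h): the largest centered box
    inside a width x height frame with aspect ratio ratio_w:ratio_h."""
    # Compare aspect ratios by cross-multiplication to stay in integers.
    if width * ratio_h > height * ratio_w:
        # Frame is wider than the target: keep full height, trim the sides.
        crop_h = height
        crop_w = height * ratio_w // ratio_h
    else:
        # Frame is taller than (or equal to) the target: keep full width.
        crop_w = width
        crop_h = width * ratio_h // ratio_w
    # Round down to even dimensions.
    crop_w -= crop_w % 2
    crop_h -= crop_h % 2
    x = (width - crop_w) // 2
    y = (height - crop_h) // 2
    return x, y, crop_w, crop_h


if __name__ == "__main__":
    # 1080p landscape frame cropped to a 9:16 portrait window.
    print(center_crop_box(1920, 1080, 9, 16))
```

A crop discards the sides of the frame; padding or a stacked layout would keep the full picture at the cost of empty space, and either could serve the same goal.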

Maintenance & Community

  • Active development with a new desktop version released.
  • Community channels include Discord and QQ groups.
  • Links to Twitter and Bilibili presence are provided.

Licensing & Compatibility

  • The repository's license is not explicitly stated in the README.

Limitations & Caveats

  • The desktop version for macOS may require manual trust configuration due to signing issues.
  • Local speech recognition through FasterWhisper is not supported on macOS; WhisperKit is the local option on that platform.
  • The README mentions the desktop version "still has some bugs and is being continuously updated."

Health Check

  • Last commit: 1 day ago
  • Responsiveness: 1 day
  • Pull requests (30d): 11
  • Issues (30d): 17

Star History

  • 1,653 stars in the last 90 days
