KakaruHayate
Vocal timbre classification CLI tool
Top 98.4% on SourcePulse
ColorSplitter is a CLI tool for pre-processing single-speaker vocal datasets by classifying and separating vocal timbres. It addresses unstable timbre performance in AI voice models by filtering training data, benefiting researchers and developers in voice synthesis. The tool offers advanced capabilities like automatic clustering optimization and emotion classification to enhance data quality.
How It Works
The project uses Speaker Verification technology to categorize vocal timbre styles. It employs clustering algorithms (SpectralCluster, UmapHdbscan) with automatic optimization, cluster merging (--mer_cosine), and minimum cluster counts (--nmin). This automates timbre data refinement for consistent model inputs, offering novel filtering via emotion encoding and mixed-feature audio selection.
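As an illustration of the cluster-merging idea, here is a minimal Python sketch. It is not ColorSplitter's actual code. It assumes every clip already has an embedding vector and a cluster label, and it unions clusters whose centroids reach a cosine-similarity threshold, which is presumably the kind of value --mer_cosine controls. The function names and the single-pass union strategy are assumptions made for this example.

```python
import math
from collections import defaultdict


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def centroid(vectors):
    """Element-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def merge_similar_clusters(embeddings, labels, threshold):
    """Hypothetical helper: relabel clips so that clusters whose centroids
    have cosine similarity >= threshold share the smallest label involved.
    Centroids are computed once, before any merging."""
    groups = defaultdict(list)
    for emb, lab in zip(embeddings, labels):
        groups[lab].append(emb)

    ids = sorted(groups)
    cents = {i: centroid(groups[i]) for i in ids}
    parent = {i: i for i in ids}  # union-find forest over cluster ids

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    for idx, a in enumerate(ids):
        for b in ids[idx + 1:]:
            if cosine(cents[a], cents[b]) >= threshold:
                ra, rb = find(a), find(b)
                if ra != rb:
                    parent[max(ra, rb)] = min(ra, rb)

    return [find(lab) for lab in labels]
```

For example, merge_similar_clusters([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.95, 0.05]], [0, 0, 1, 2], 0.9) folds cluster 2 into cluster 0 and leaves cluster 1 alone. Because merging is transitive here, a low threshold can chain distant clusters together, so a conservative value is the safer starting point.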
Quick Start & Requirements
Installation requires cloning the repository and running pip install -r requirements.txt. Prerequisites include Python 3.8 and Microsoft C++ Build Tools. GPU PyTorch is recommended; CPU PyTorch suffices if only the timbre encoder is used. Input data must be placed in ./input/<speaker_name>/raw/wavs/. For emotion classification, download pytorch_model.bin from Hugging Face at https://huggingface.co/audeering/wav2vec2-large-robust-12-ft-emotion-msp-dim.
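Before running the tool, it can help to confirm that the dataset follows the expected layout. The sketch below is a hypothetical standalone check, not part of ColorSplitter. It assumes only the ./input/<speaker_name>/raw/wavs/ structure described above and lists the .wav files found for each speaker folder.

```python
from pathlib import Path


def list_speaker_wavs(input_root="input"):
    """Hypothetical helper: map each speaker folder under input_root to the
    sorted .wav files in <speaker>/raw/wavs/. Speakers whose wavs folder is
    missing or empty map to an empty list."""
    root = Path(input_root)
    result = {}
    if not root.is_dir():
        return result
    for speaker_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        wav_dir = speaker_dir / "raw" / "wavs"
        # Only lowercase ".wav" is matched on case-sensitive file systems.
        wavs = sorted(wav_dir.glob("*.wav")) if wav_dir.is_dir() else []
        result[speaker_dir.name] = wavs
    return result


if __name__ == "__main__":
    for speaker, wavs in list_speaker_wavs().items():
        status = f"{len(wavs)} wav file(s)" if wavs else "no wav files found"
        print(f"{speaker}: {status}")
```

A speaker reported with no wav files usually means the audio sits one level too high or too low in the folder tree.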
Highlighted Details
Emotion classification uses the wav2vec2-large-robust-12-ft-emotion-msp-dim model.
--encoder mix filters audio based on dual features, useful for advanced VITS model prompting.
Maintenance & Community
The project acknowledges community user "洛泠羽". The README gives no further details on maintainers, community channels, or a roadmap.
Licensing & Compatibility
No specific licensing information or compatibility notes for commercial use were found in the provided README content.
Limitations & Caveats
The project notes that the relationship between singing timbre variations and voiceprint differences is unclear, framing its application partly as experimental. Research in this area is described as scarce. Users may still need to manually merge small clusters after automated classification.
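To make the manual merge step easier, one option is to list the clusters that ended up with very few clips. The following is a minimal sketch under assumed inputs: a flat list of per-clip cluster labels and a hypothetical min_size cutoff chosen by the user. It does not reproduce any ColorSplitter option.

```python
from collections import Counter


def small_clusters(labels, min_size):
    """Hypothetical helper: return (label, clip_count) pairs for clusters
    with fewer than min_size clips, sorted by label."""
    counts = Counter(labels)
    return sorted((lab, n) for lab, n in counts.items() if n < min_size)
```

With labels [0, 0, 0, 1, 2, 2] and min_size=3, this flags clusters 1 and 2 as candidates to review and merge by hand.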