LLM for music understanding and generation
ChatMusician is an open-source Large Language Model (LLM) designed to understand and generate music intrinsically, treating it as a second language. It targets researchers and developers interested in multimodal AI and creative language generation, offering a pure text-based approach to music composition and analysis without external multi-modal components.
How It Works
ChatMusician is built upon LLaMA2-7B, continually pre-trained and fine-tuned on ABC notation, a text-compatible music representation. This allows the model to process and generate music using only a standard text tokenizer, simplifying the architecture and enabling seamless integration with existing LLM frameworks. Its key advantage is the ability to compose structured, full-length music conditioned on musical elements such as text descriptions, chords, and melodies, while retaining strong general language capabilities.
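Because the model works entirely in text, it can be driven like any other causal LLM. Below is a minimal generation sketch using Hugging Face transformers; the repository id "m-a-p/ChatMusician", the prompt wording, and the sampling settings are assumptions, so consult the project's README for the exact prompt template.

```python
# Minimal sketch: prompt ChatMusician for a tune in ABC notation.
# The model id "m-a-p/ChatMusician" and the plain-text prompt format are assumptions;
# the project's README documents the exact chat template and recommended settings.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "m-a-p/ChatMusician"  # assumed Hugging Face repository id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)

prompt = (
    "Compose a short folk tune in D major with an AABB structure. "
    "Write it in ABC notation."
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs, max_new_tokens=512, do_sample=True, temperature=0.8, top_p=0.95
)
# Strip the prompt tokens and print only the generated continuation.
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```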
Quick Start & Requirements
Install the Python dependencies with:
pip install -r requirements.txt
Converting the model's ABC notation output to MIDI and rendered scores additionally requires abcmidi and MuseScore.
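The external tools come into play after generation: abcmidi converts ABC text to MIDI, and MuseScore renders MIDI to a score or audio. The sketch below shells out to both; the binary names ("abc2midi", "mscore") and output formats are common defaults and may differ on your system.

```python
# Sketch: render model-generated ABC notation to MIDI and an engraved score.
# Assumes abc2midi (from the abcmidi package) and a MuseScore binary named "mscore"
# are on PATH; the MuseScore executable name varies by platform and version.
import subprocess
from pathlib import Path

abc_text = """X:1
T:Example Tune
M:4/4
K:D
|: D2 F2 A2 d2 | c2 A2 F2 D2 :|
"""

Path("tune.abc").write_text(abc_text)

# ABC -> MIDI via abcmidi
subprocess.run(["abc2midi", "tune.abc", "-o", "tune.mid"], check=True)

# MIDI -> PDF score via MuseScore; changing the extension (e.g. .wav) changes the export format
subprocess.run(["mscore", "tune.mid", "-o", "tune.pdf"], check=True)
```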
Maintenance & Community
The project was released on 2023-12-10, with active development indicated by recent updates. Community support is available via GitHub issues.
Licensing & Compatibility
The project is released under a permissive license, allowing for commercial use and integration with closed-source applications.
Limitations & Caveats
ChatMusician currently supports only strictly formatted, closed-ended instructions for music tasks, with plans to improve generalization through more diverse training data. It may hallucinate and is not recommended for music education without further refinement. A significant portion of the training data is in the style of Irish music, and its in-context learning and chain-of-thought abilities are noted as weak.