MR-Models by mtkresearch

Foundation models for Traditional Chinese language and multimodal tasks

Created 2 years ago
250 stars

Top 100.0% on SourcePulse

Project Summary

Foundation Models for Traditional Chinese Language.

MediaTek Research Foundation Models (MR-Models) delivers specialized foundation models for Traditional Chinese language understanding and generation. Aimed at researchers and industry professionals, the models improve the linguistic and cultural representation of Traditional Chinese, with the goal of accelerating AI development and application within the Traditional Chinese-speaking community.

How It Works

The project offers foundation models built on LLaMA, Mixtral, and Mistral architectures, specifically optimized for Traditional Chinese language understanding and generation. Key advancements include expanded vocabularies with tens of thousands of Traditional Chinese tokens, resulting in up to 2x faster inference speeds for Chinese tasks compared to their base models. Models like Breeze 2 introduce multimodal capabilities by integrating vision encoders and supporting function-calling through post-training. Additionally, advanced speech synthesis models are available, featuring support for voice cloning and multilingual capabilities.
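The speed-up from vocabulary expansion is visible at the tokenizer level: dedicated Traditional Chinese tokens mean fewer tokens per sentence, and therefore fewer autoregressive decoding steps. Below is a minimal sketch using Hugging Face transformers; both repo ids are assumptions, so verify them against the project's model collections.

```python
# A minimal sketch, assuming these Hugging Face repo ids exist as named
# (verify against the project's model collections).
from transformers import AutoTokenizer

text = "繁體中文的基礎模型可以加速人工智慧在台灣的發展與應用。"

base = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-v0.1")  # assumed base-model id
breeze = AutoTokenizer.from_pretrained("MediaTek-Research/Breeze-7B-Instruct-v1_0")  # assumed id

# The expanded vocabulary should encode the same sentence in far fewer tokens,
# which directly reduces the number of decoding steps at inference time.
print("base tokens:  ", len(base.tokenize(text)))
print("breeze tokens:", len(breeze.tokenize(text)))
```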

Quick Start & Requirements

The mtkresearch package is available on PyPI, but the README does not give specific installation commands or prerequisites. Links to papers, demos, and model collections are available for the individual model families (Breeze 2, BreeXe, Breeze).
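Since the models are published as standard checkpoints, a plain transformers generation loop is a reasonable starting point. The sketch below is an assumption rather than documented usage: the repo id is unverified, and the mtkresearch package's own helpers may be the preferred interface.

```python
# pip install mtkresearch transformers torch   (mtkresearch is on PyPI per the README)
# A minimal generation sketch, assuming the repo id below is correct.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "MediaTek-Research/Breeze-7B-Instruct-v1_0"  # assumed id; check the model collections
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

prompt = "請用繁體中文簡介台灣的夜市文化。"
inputs = tok(prompt, return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=128)
print(tok.decode(out[0], skip_special_tokens=True))
```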

Highlighted Details

  • Breeze 2 Family: Available in 3B and 8B parameter sizes, based on LLaMA 3.2. Adds vision-aware capabilities via a visual encoder and bridge module, plus function-calling support through prompt templates and post-training (see the sketch after this list).
  • BreeXe-8x7B Family: Built on Mixtral-8x7B with an expanded vocabulary (30,000+ Traditional Chinese tokens). Achieves 2x faster Traditional Chinese inference speed and demonstrates performance comparable to OpenAI's gpt-3.5-turbo-1106.
  • Breeze-7B Family: Based on Mistral-7B with an expanded vocabulary (30,000+ Traditional Chinese tokens). Offers 2x faster Traditional Chinese inference speed and shows significant advantages over contemporary open-source models like Taiwan-LLM and Qwen.
  • Speech Synthesis Models: Support Traditional Chinese, English, and Bopomofo input, with Chinese speech output. Includes advanced voice cloning capabilities.
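To make the function-calling support concrete, the sketch below shows the general pattern only: a tool schema is embedded in the prompt, and a post-trained model replies with a structured call that the caller parses and dispatches. The schema layout, tool name, and canned model reply are hypothetical stand-ins, not the project's actual prompt templates (those ship with the mtkresearch package and the model cards).

```python
# A hypothetical sketch of the function-calling pattern; the schema format
# and the model reply below are illustrative stand-ins, not MR-Models' own
# prompt templates.
import json

# Describe a tool in JSON Schema form and embed it in the prompt.
tools = [{
    "name": "get_weather",  # hypothetical tool
    "description": "Look up current weather for a city.",
    "parameters": {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
}]
prompt = f"Tools:\n{json.dumps(tools, ensure_ascii=False)}\n\nUser: 台北現在天氣如何？"

# A post-trained model is expected to answer with a structured call, e.g.:
model_reply = '{"name": "get_weather", "arguments": {"city": "Taipei"}}'

call = json.loads(model_reply)
if call["name"] == "get_weather":
    print(f"dispatch get_weather({call['arguments']})")
```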

Maintenance & Community

The README does not mention maintenance practices, community channels (e.g., Discord/Slack), or notable contributors.

Licensing & Compatibility

Models are provided for academic research or industry use, but the README does not specify license types or commercial-use terms.

Limitations & Caveats

The README does not explicitly detail limitations, known bugs, or development status (e.g., alpha/beta) for the models.

Health Check

  • Last Commit: 1 month ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 3 stars in the last 30 days
