LLM translator for many-to-many language translation
ALMA is a suite of LLM-based translation models, offering three generations (ALMA, ALMA-R, X-ALMA) that progressively improve translation quality and language coverage. It targets researchers and practitioners seeking state-of-the-art machine translation, with reported results that match or exceed strong baselines such as GPT-4 and WMT competition winners.
How It Works
ALMA employs a two-step fine-tuning process: initial fine-tuning on monolingual data followed by optimization on high-quality parallel data. ALMA-R further refines models using Contrastive Preference Optimization (CPO) with triplet preference data. X-ALMA extends this to 50 languages via a plug-and-play language-specific module architecture and a 5-step training recipe incorporating Adaptive-Rejection Preference Optimization. This modular approach and advanced optimization techniques enable broad language support and high performance.
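Because CPO is central to ALMA-R, a minimal training sketch is shown below, using the CPO implementation that was later merged into Hugging Face's trl library (see Maintenance & Community). The model ID, preference triplets, and hyperparameters here are illustrative assumptions, not the authors' published recipe.

```python
# A sketch only: model ID, preference triplets, and hyperparameters are
# illustrative assumptions, not the authors' published training recipe.
from datasets import Dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import CPOConfig, CPOTrainer

model_id = "haoranxu/ALMA-13B"  # assumed starting checkpoint
model = AutoModelForCausalLM.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Triplet preference data: each prompt is paired with a preferred ("chosen")
# and a dispreferred ("rejected") translation.
pref_data = Dataset.from_dict({
    "prompt": ["Translate this from English to German:\nEnglish: The cat sleeps.\nGerman:"],
    "chosen": [" Die Katze schläft."],
    "rejected": [" Der Katze schlafen."],
})

args = CPOConfig(
    output_dir="alma-cpo",
    beta=0.1,                       # strength of the preference term
    max_length=512,
    per_device_train_batch_size=1,
)

trainer = CPOTrainer(
    model=model,
    args=args,
    train_dataset=pref_data,
    tokenizer=tokenizer,            # newer trl releases call this processing_class
)
trainer.train()
```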
Quick Start & Requirements
Requires the Hugging Face transformers library. Example usage is provided for X-ALMA with transformers and peft; a minimal sketch follows.
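The snippet below loads one X-ALMA group checkpoint and generates a translation. The model IDs and prompt template are assumptions based on the public ALMA/X-ALMA Hugging Face cards; consult the repository README for the exact identifiers and the peft-based per-language loading variant.

```python
# A sketch only: model IDs and prompt template are assumptions, not excerpts
# from this README.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "haoranxu/X-ALMA-13B-Group1"  # assumed ID for one merged language group
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,
    device_map="auto",  # requires accelerate; use .to("cuda") otherwise
)

# ALMA-style translation prompt: name the source and target languages, give the
# source sentence, and let the model complete the target side.
prompt = (
    "Translate this from English to German:\n"
    "English: The weather is nice today.\n"
    "German:"
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

with torch.no_grad():
    output = model.generate(**inputs, max_new_tokens=64, num_beams=5, do_sample=False)

# Strip the prompt tokens and print only the generated translation.
print(tokenizer.decode(output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))

# Per-language adapter loading via peft (IDs again assumed; see the repo README):
# from peft import PeftModel
# base = AutoModelForCausalLM.from_pretrained("haoranxu/X-ALMA-13B-Pretrain")
# model = PeftModel.from_pretrained(base, "haoranxu/X-ALMA-13B-Group1")
```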
Highlighted Details
Maintenance & Community
The project has seen recent activity with X-ALMA's release and acceptance at ICLR 2025. CPO has been merged into Hugging Face's trl library. No specific community links (Discord/Slack) are provided in the README.
Licensing & Compatibility
The README does not explicitly state a license. The models are distributed via Hugging Face and load with the standard transformers/peft stack, but commercial-use terms are not detailed.
Limitations & Caveats
The README does not detail specific limitations or known bugs. Training X-ALMA is described as complex due to the need for numerous intermediate checkpoints. The third loading method for X-ALMA requires substantial GPU memory.