Cabrita: fine-tuned LLaMA for Portuguese instruction following
Cabrita is a Portuguese-language, instruction-tuned LLaMA model intended for research use. It addresses the shortage of Portuguese instruction-following models by fine-tuning LLaMA-7B on a Portuguese translation of the Stanford Alpaca dataset. The project documents the steps needed to replicate the fine-tuning process, enabling researchers to explore Portuguese language model development.
How It Works
The project uses the Alpaca-LoRA codebase to fine-tune LLaMA. The Stanford Alpaca dataset was translated to Portuguese with ChatGPT at minimal cost. Because LoRA updates only a small set of adapter weights rather than the full model, fine-tuning is fast: the authors report usable Portuguese instruction-following results after about one hour of training on a single A100 GPU.
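The translation step the authors describe can be replicated as a rough sketch with the OpenAI chat API; the file names and model choice below are illustrative assumptions, not the project's exact script:

```python
# Hedged sketch of the dataset translation step: send each Alpaca field to
# the OpenAI chat API and collect the Portuguese output. File names and the
# model choice are illustrative assumptions, not the authors' exact script.
import json

from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def translate(text: str) -> str:
    # Ask for a direct translation only, so the record structure survives.
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[
            {
                "role": "system",
                "content": "Translate the user's text to Portuguese. "
                           "Reply with the translation only.",
            },
            {"role": "user", "content": text},
        ],
    )
    return response.choices[0].message.content.strip()

with open("alpaca_data.json") as f:
    records = json.load(f)  # list of {"instruction", "input", "output"}

translated = [
    {field: translate(value) if value else value for field, value in rec.items()}
    for rec in records
]

with open("alpaca_data_pt.json", "w") as f:
    json.dump(translated, f, ensure_ascii=False, indent=2)
```

A minimal sketch of what the Alpaca-LoRA-style fine-tuning might look like with the Hugging Face PEFT library follows; the base checkpoint, dataset path, and hyperparameters are assumptions, not the project's published configuration:

```python
# Minimal Alpaca-LoRA-style fine-tuning sketch using Hugging Face PEFT.
# Base checkpoint, dataset path, and hyperparameters are assumptions for
# illustration, not the project's published configuration.
import torch
from datasets import load_dataset
from peft import LoraConfig, get_peft_model
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

base_model = "huggyllama/llama-7b"  # assumed base checkpoint
tokenizer = AutoTokenizer.from_pretrained(base_model)
tokenizer.pad_token = tokenizer.eos_token

model = AutoModelForCausalLM.from_pretrained(
    base_model, torch_dtype=torch.float16, device_map="auto"
)

# Attach LoRA adapters to the attention projections, as in Alpaca-LoRA.
model = get_peft_model(
    model,
    LoraConfig(
        r=8,
        lora_alpha=16,
        target_modules=["q_proj", "v_proj"],
        lora_dropout=0.05,
        task_type="CAUSAL_LM",
    ),
)

# Hypothetical path to the translated dataset; format mirrors Stanford Alpaca.
data = load_dataset("json", data_files="alpaca_data_pt.json")

def tokenize(example):
    # Flatten instruction/input/output into one Portuguese training prompt.
    prompt = f"Instrução: {example['instruction']}\n"
    if example.get("input"):
        prompt += f"Entrada: {example['input']}\n"
    prompt += f"Resposta: {example['output']}"
    return tokenizer(prompt, truncation=True, max_length=512)

train = data["train"].map(tokenize, remove_columns=data["train"].column_names)

trainer = Trainer(
    model=model,
    train_dataset=train,
    args=TrainingArguments(
        output_dir="cabrita-lora",
        per_device_train_batch_size=4,
        gradient_accumulation_steps=8,
        num_train_epochs=3,
        learning_rate=3e-4,
        fp16=True,
        logging_steps=10,
    ),
    # mlm=False makes the collator copy input_ids into labels for causal LM.
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
model.save_pretrained("cabrita-lora")
```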
Quick Start & Requirements
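As a hedged quick-start sketch, one way to try the model is to load a base LLaMA-7B checkpoint and apply the Cabrita LoRA adapter with PEFT; both repository IDs below are assumptions for illustration:

```python
# Hedged quick-start sketch: load a base LLaMA-7B checkpoint and apply the
# Cabrita LoRA adapter with PEFT. Both repository IDs are assumptions.
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_model = "huggyllama/llama-7b"   # assumed base checkpoint
adapter = "22h/cabrita-lora-v0-1"    # assumed adapter repository

tokenizer = AutoTokenizer.from_pretrained(base_model)
model = AutoModelForCausalLM.from_pretrained(
    base_model, torch_dtype=torch.float16, device_map="auto"
)
model = PeftModel.from_pretrained(model, adapter)

prompt = "Instrução: Explique o que é aprendizado de máquina.\nResposta:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(
    **inputs, max_new_tokens=256, do_sample=True, temperature=0.7
)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```

Note that the 7B weights alone occupy roughly 14 GB in fp16, so a single modern data-center or high-end consumer GPU is the practical minimum for inference.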
Highlighted Details
Maintenance & Community
The repository was last updated 2 years ago and is marked inactive.
Licensing & Compatibility
Limitations & Caveats
The ChatGPT-based dataset translation, while cost-effective, is acknowledged by the authors as not being of the highest quality. The model is strictly for research use; commercial applications are prohibited.