Vietnamese generative model (research paper & models)
PhoGPT is a series of state-of-the-art generative language models specifically trained for Vietnamese. It offers a 3.7B parameter base model (PhoGPT-4B) and a chat-tuned variant (PhoGPT-4B-Chat), both featuring an 8192 context length. These models are designed for researchers and developers working with Vietnamese NLP tasks, providing a powerful foundation for applications requiring Vietnamese text generation and understanding.
How It Works
PhoGPT-4B was pre-trained from scratch on a massive 102B token Vietnamese corpus. The PhoGPT-4B-Chat variant is then fine-tuned on a dataset of 70K instructional prompts and responses, augmented with 290K conversational turns. This approach leverages a large context window and a specialized vocabulary to capture the nuances of the Vietnamese language, aiming for superior performance in generative tasks compared to existing open-source models.
Quick Start & Requirements
Requires torch.bfloat16 or torch.float16; bitsandbytes is needed for quantization.
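The requirements above can be put together in a minimal usage sketch with Hugging Face transformers. This is an assumption-laden example, not the project's official snippet: the model ID vinai/PhoGPT-4B-Chat and the "### Câu hỏi:"/"### Trả lời:" prompt template are taken from the PhoGPT Hugging Face model card, and trust_remote_code=True is assumed because the architecture ships custom model code; verify both against the model card before use. Heavy imports are deferred so the prompt helper stays dependency-free.

```python
PROMPT_TEMPLATE = "### Câu hỏi: {instruction}\n### Trả lời:"


def build_prompt(instruction: str) -> str:
    """Wrap a user instruction in the assumed chat prompt template."""
    return PROMPT_TEMPLATE.format(instruction=instruction)


def generate(instruction: str,
             model_id: str = "vinai/PhoGPT-4B-Chat",  # assumed HF model ID
             max_new_tokens: int = 256) -> str:
    """Load the chat model and generate a response (downloads weights on first use)."""
    # torch/transformers imports deferred so build_prompt() works without them.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # bfloat16 where the GPU supports it, float16 otherwise, per the requirements above.
    use_bf16 = torch.cuda.is_available() and torch.cuda.is_bf16_supported()
    dtype = torch.bfloat16 if use_bf16 else torch.float16

    tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
    model = AutoModelForCausalLM.from_pretrained(
        model_id, torch_dtype=dtype, trust_remote_code=True, device_map="auto"
    )
    inputs = tokenizer(build_prompt(instruction), return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens, do_sample=False)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

For quantized loading, a BitsAndBytesConfig passed via quantization_config in from_pretrained would replace the dtype logic, trading some quality for a much smaller memory footprint.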
Maintenance & Community
Developed by VinAI Research. Further details on fine-tuning can be found in the llm-foundry documentation.
Licensing & Compatibility
The README does not explicitly state a license. Models hosted on Hugging Face are often released under Apache 2.0 or similarly permissive licenses, but commercial use would require explicit confirmation of this model's license terms.
Limitations & Caveats
The model is noted to perform poorly on reasoning, coding, and mathematics tasks. It may generate harmful, biased, or factually incorrect content, requiring cautious use and output validation.