Mengzi by Langboat

Chinese language models for NLP tasks, emphasizing efficiency

Created 4 years ago
538 stars

Top 59.1% on SourcePulse

View on GitHub
Project Summary

Mengzi offers a suite of lightweight yet powerful pre-trained language models for Chinese NLP tasks, targeting researchers and developers seeking efficient deployment. The models aim to provide competitive performance with reduced computational costs, making them suitable for industrial applications.

How It Works

Mengzi models leverage linguistic information and training acceleration techniques to achieve high performance with smaller parameter counts. They maintain compatibility with existing BERT and T5 architectures, allowing for seamless integration into current NLP pipelines. This approach prioritizes efficiency and ease of deployment without sacrificing model quality.
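
Because the checkpoints follow the stock architectures, they load through the standard classes with no custom code. A minimal sketch for the T5 variant (the identifier mirrors the BERT one shown under Quick Start; per the Limitations section, generation tasks still require fine-tuning):

    # The T5-compatible checkpoint loads via the standard T5 classes
    from transformers import T5Tokenizer, T5ForConditionalGeneration

    tokenizer = T5Tokenizer.from_pretrained("Langboat/mengzi-t5-base")
    model = T5ForConditionalGeneration.from_pretrained("Langboat/mengzi-t5-base")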

Quick Start & Requirements

  • Installation: pip install transformers or pip install paddlenlp.
  • Usage: Load models using Hugging Face transformers or PaddleNLP (a PaddleNLP sketch follows this list).
    # Hugging Face example: Mengzi-BERT is a drop-in BERT checkpoint,
    # so the stock BertTokenizer/BertModel classes load it directly.
    from transformers import BertTokenizer, BertModel
    tokenizer = BertTokenizer.from_pretrained("Langboat/mengzi-bert-base")
    model = BertModel.from_pretrained("Langboat/mengzi-bert-base")
    
  • Dependencies: Python, transformers or paddlenlp. No specific hardware requirements are listed beyond standard ML environments.
  • Resources: Models are available on Hugging Face and via direct download links.
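
For PaddlePaddle users, a minimal sketch using PaddleNLP's Auto classes; this assumes PaddleNLP resolves the same Langboat/mengzi-bert-base identifier as the Hugging Face hub, and the example sentence is placeholder input:

    # PaddleNLP example (sketch; assumes the Langboat/mengzi-bert-base
    # identifier resolves through PaddleNLP's model hub)
    import paddle
    from paddlenlp.transformers import AutoTokenizer, AutoModel

    tokenizer = AutoTokenizer.from_pretrained("Langboat/mengzi-bert-base")
    model = AutoModel.from_pretrained("Langboat/mengzi-bert-base")

    # Encode one sentence and run a forward pass
    encoded = tokenizer("孟子是先秦儒家的代表人物。")  # returns Python lists
    input_ids = paddle.to_tensor([encoded["input_ids"]])
    sequence_output, pooled_output = model(input_ids)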

Highlighted Details

  • Offers various BERT and T5-based models, including specialized versions for finance (Mengzi-BERT-base-fin) and multi-task learning (Mengzi-T5-base-MT).
  • Includes multimodal models (Mengzi-Oscar-base) for image-text tasks and generative models like Mengzi-GPT-neo-base and BLOOM variants fine-tuned on Chinese data.
  • Achieves competitive results on the CLUE benchmark, often outperforming standard models like RoBERTa-wwm-ext.
  • Provides a Chinese AI writing model capable of generating poetry and couplets.

Maintenance & Community

  • Active updates noted until late 2022.
  • Contact available via WeChat discussion group and email.
  • A technical report is available via arXiv.

Licensing & Compatibility

  • Models are released under a permissive license that allows broad use within its terms.
  • No explicit restrictions on commercial use or closed-source linking are mentioned.

Limitations & Caveats

  • The project's training code is not open-sourced due to internal infrastructure coupling.
  • While Mengzi-T5-base is T5-compatible, it ships without downstream task heads and requires fine-tuning for specific text generation tasks (a minimal fine-tuning sketch follows this list).
  • The project's last update was in November 2022, indicating potential staleness.
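
Since the T5 checkpoint ships without downstream heads, a typical workflow fine-tunes it with the standard Hugging Face seq2seq tooling. A minimal sketch, assuming a hypothetical train.json file with paired "input"/"target" text fields:

    # Fine-tuning sketch for Mengzi-T5-base (train.json is a hypothetical
    # dataset of {"input": ..., "target": ...} records)
    from datasets import load_dataset
    from transformers import (
        T5Tokenizer,
        T5ForConditionalGeneration,
        DataCollatorForSeq2Seq,
        Seq2SeqTrainer,
        Seq2SeqTrainingArguments,
    )

    tokenizer = T5Tokenizer.from_pretrained("Langboat/mengzi-t5-base")
    model = T5ForConditionalGeneration.from_pretrained("Langboat/mengzi-t5-base")

    raw = load_dataset("json", data_files="train.json")["train"]

    def preprocess(batch):
        # Tokenize sources; tokenized targets become the labels
        enc = tokenizer(batch["input"], max_length=128, truncation=True)
        enc["labels"] = tokenizer(
            batch["target"], max_length=128, truncation=True
        )["input_ids"]
        return enc

    train_ds = raw.map(preprocess, batched=True, remove_columns=raw.column_names)

    trainer = Seq2SeqTrainer(
        model=model,
        args=Seq2SeqTrainingArguments(output_dir="mengzi-t5-ft", num_train_epochs=3),
        train_dataset=train_ds,
        data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
    )
    trainer.train()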

Health Check

  • Last commit: 2 years ago
  • Responsiveness: inactive
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 1 star in the last 30 days