MOSS by OpenMOSS

Open-source tool-augmented conversational language model

created 2 years ago
12,063 stars

Top 4.2% on sourcepulse

Project Summary

MOSS is an open-source, tool-augmented conversational language model developed by Fudan University, designed for bilingual (Chinese/English) interaction and capable of utilizing various plugins. It aims to provide a helpful, honest, and harmless AI assistant for a wide range of language-based tasks, targeting researchers and developers looking to deploy or fine-tune advanced conversational AI.

How It Works

MOSS is built upon a 16-billion parameter foundation model pre-trained on approximately 700 billion tokens of Chinese, English, and code data. It undergoes instruction fine-tuning for dialogue capabilities and safety, followed by plugin-enhanced learning for tool usage (search, calculator, equation solver, text-to-image). Further refinement includes human preference training for improved factuality and safety. Quantized versions (INT4/INT8) are available for reduced memory footprint.
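The dialogue format used after instruction fine-tuning can be sketched as follows. The meta-instruction text and the special turn markers (`<|Human|>`, `<eoh>`, `<|MOSS|>`) are modelled on the MOSS README and should be treated as assumptions here; check them against the repository before use.

```python
# Sketch of assembling a single-turn MOSS prompt.
# Meta-instruction abbreviated; turn markers are assumptions from the README.
META = "You are an AI assistant whose name is MOSS."

def build_prompt(user_message: str) -> str:
    """Meta-instruction plus one human turn, ready for generation."""
    return f"{META}\n<|Human|>: {user_message}<eoh>\n<|MOSS|>:"

print(build_prompt("Can you recommend a few sci-fi novels?"))
```

The model's reply is then generated as a continuation after the `<|MOSS|>:` marker.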

Quick Start & Requirements

  • Install: Clone the repository and install dependencies via pip install -r requirements.txt.
  • Prerequisites: Python 3.8+, PyTorch (>=1.13.1+cu117 recommended), and Transformers. The Triton backend used for quantization runs on Linux/WSL only.
  • Hardware: loading in FP16 requires ~31GB of VRAM; INT4 requires ~7.8GB. Multi-GPU deployment is supported for smaller cards such as NVIDIA 3090s.
  • Demos: Streamlit (moss_web_demo_streamlit.py), Gradio (moss_web_demo_gradio.py), CLI (moss_cli_demo.py), and API demos are provided.
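The VRAM figures above can be sanity-checked with back-of-envelope arithmetic: a 16-billion-parameter model needs roughly `parameters × bytes-per-parameter` just for its weights, with the remainder of the quoted figures going to activations, KV cache, and framework overhead.

```python
# Back-of-envelope weight memory for a 16B-parameter model at each precision.
# Activations, KV cache, and overhead are excluded, so real usage is higher.
PARAMS = 16e9

def weight_gib(bytes_per_param: float) -> float:
    return PARAMS * bytes_per_param / 2**30

for name, width in [("FP16", 2.0), ("INT8", 1.0), ("INT4", 0.5)]:
    print(f"{name}: ~{weight_gib(width):.1f} GiB for weights alone")
```

This yields roughly 30 GiB for FP16 and 7.5 GiB for INT4, consistent with the ~31GB and ~7.8GB figures quoted above.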

Highlighted Details

  • Offers multiple model variants: base, instruction-tuned, plugin-enhanced, and quantized (INT4/INT8).
  • Supports tool integration for web search, calculation, equation solving, and text-to-image generation.
  • Provides comprehensive deployment options including single/multi-GPU, quantization, and various demo interfaces.
  • Fine-tuning scripts are available for custom training on dialogue data.
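A tool-augmented deployment must parse the commands the model emits and route them to local tools. A minimal dispatch sketch is shown below; the `Calculate("...")` command syntax is a hypothetical stand-in modelled on MOSS's plugin format, not a verified API.

```python
# Minimal sketch of dispatching a model-emitted command to a local tool.
# The Calculate("...") syntax is an assumption, not MOSS's verified format.
import re

def dispatch(command: str) -> str:
    m = re.fullmatch(r'Calculate\("([^"]+)"\)', command.strip())
    if m:
        expr = m.group(1)
        # Allow simple arithmetic only; a real deployment needs a safe parser.
        if re.fullmatch(r"[0-9+\-*/(). ]+", expr):
            return str(eval(expr))
    return "unsupported command"

print(dispatch('Calculate("(37 + 5) * 2")'))  # -> 84
```

In MOSS's scheme, the tool's result is fed back into the context so the model can ground its final answer on it.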

Maintenance & Community

  • Developed by Fudan University.
  • Community contributions are welcomed via Pull Requests. Links to WeChat groups are provided.

Licensing & Compatibility

  • Code: Apache 2.0
  • Data: CC BY-NC 4.0
  • Model Weights: GNU AGPL 3.0
  • Commercial use requires authorization: sign the agreement and complete a questionnaire; no fee is charged for this authorization.

Limitations & Caveats

MOSS may still generate factually incorrect or biased/harmful content due to its model size and autoregressive nature, and users are cautioned against spreading such content. Quantized models currently support only single-GPU deployment.

Health Check

  • Last commit: 1 year ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 73 stars in the last 90 days

