Skywork by SkyworkAI

LLM for multilingual tasks, creative writing, math, and multimodal applications

Created 2 years ago

1,463 stars

Top 27.8% on SourcePulse

Project Summary

The Skywork project provides a series of 13B parameter large language models (LLMs) trained on 3.2TB of multilingual and code data, aiming to offer strong performance across general tasks, creative writing, and mathematical reasoning. It targets researchers and developers seeking high-quality, open-source bilingual (Chinese/English) models with commercial use potential.

How It Works

Compared with Llama-2-13B, Skywork models use a thinner, deeper architecture (52 layers) and a larger vocabulary (65,536 tokens) built with BPE tokenization. Training is a two-stage process: pre-training on general corpora, followed by a second stage that adds STEM data to strengthen reasoning and mathematical ability. The project also releases SkyPile-150B, a 600GB Chinese dataset, and offers quantized versions for deployment on consumer GPUs.
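To make "thinner, deeper" concrete, the rough parameter count below compares the two shapes. The layer count and vocabulary size for Skywork come from the summary above; the hidden and feed-forward widths, the Llama-2-13B figures, and the Llama-style block layout (four attention projections, a three-matrix gated MLP, untied embeddings) are assumptions made for illustration and should be checked against the Tech Report. Normalization weights are ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DecoderShape:
    layers: int
    hidden: int
    ffn: int
    vocab: int

    def per_layer_params(self) -> int:
        attention = 4 * self.hidden * self.hidden  # Q, K, V and output projections
        mlp = 3 * self.hidden * self.ffn           # gate, up and down projections
        return attention + mlp

    def total_params(self) -> int:
        embeddings = 2 * self.vocab * self.hidden  # untied input and output embeddings
        return self.layers * self.per_layer_params() + embeddings


# layers=52 and vocab=65536 are from the summary; hidden and ffn are ASSUMED.
SKYWORK_13B = DecoderShape(layers=52, hidden=4608, ffn=12288, vocab=65536)
# All Llama-2-13B values are ASSUMED from its public configuration.
LLAMA2_13B = DecoderShape(layers=40, hidden=5120, ffn=13824, vocab=32000)

for name, shape in (("Skywork-13B", SKYWORK_13B), ("Llama-2-13B", LLAMA2_13B)):
    print(
        f"{name}: {shape.layers} layers, "
        f"{shape.per_layer_params() / 1e6:.1f}M params per layer, "
        f"{shape.total_params() / 1e9:.2f}B total"
    )
```

Under these assumptions each Skywork layer is smaller than a Llama-2-13B layer, and the extra depth plus the larger vocabulary bring the total back to a similar size.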

Quick Start & Requirements

Installation: pip install -r requirements.txt
Prerequisites: Python 3.8+, PyTorch 2.0+, CUDA 11.4+ recommended.
Resources: Quantized versions support consumer GPUs. Full model details and inference examples are provided in the README.
Links: Hugging Face, ModelScope, Tech Report
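After installing the requirements, inference might look like the sketch below. It is not the project's official example: it assumes the checkpoints load through the standard Hugging Face transformers auto classes with trust_remote_code=True, and the model identifier is left as a parameter because the exact repository name should be taken from the Hugging Face link above. The helper name generate is hypothetical.

```python
def generate(model_id: str, prompt: str, max_new_tokens: int = 64) -> str:
    """Load a causal LM checkpoint and return the prompt plus its continuation."""
    # Imported lazily so this file can be read or imported without the heavy dependencies.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # ASSUMPTION: the checkpoint ships custom model code, hence trust_remote_code=True.
    tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        torch_dtype=torch.bfloat16,
        device_map="auto",  # needs the accelerate package installed
        trust_remote_code=True,
    ).eval()

    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    with torch.no_grad():
        output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


# Example call (the identifier is a placeholder; copy the real one from the model page):
# print(generate("<org>/<skywork-13b-checkpoint>", "The capital of Shaanxi is"))
```

Loading in bf16 needs a GPU with enough memory for the full weights; the quantized variants mentioned under Resources are the option for smaller cards.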

Highlighted Details

Skywork-13B-Base achieves top performance among 13B models on benchmarks like C-Eval (60.6), CMMLU (61.8), MMLU (62.1), and GSM8K (55.8).
Skywork-13B-Math ranks first among 13B-scale models on the GSM8K and CMATH benchmarks.
Skywork-13B-Chat is fine-tuned for creative writing tasks, with results the project describes as comparable to ChatGPT.
Skywork-13B-MM is a multimodal model for image-based Q&A.
Offers 8-bit quantized models with minimal performance degradation and reduced GPU memory usage (13.57GB vs 25.91GB for bf16).
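The memory figures in the last item can be turned into a quick sizing check. The two measurements are taken from the list above; the card capacities and the 2GB headroom for activations and cache are illustrative assumptions, not project guidance.

```python
BF16_GB = 25.91  # reported GPU memory for the bf16 model
INT8_GB = 13.57  # reported GPU memory for the 8-bit quantized model


def memory_saving(full_gb: float, quantized_gb: float) -> float:
    """Fraction of GPU memory saved by the quantized model."""
    return 1.0 - quantized_gb / full_gb


def fits(required_gb: float, card_gb: float, headroom_gb: float = 2.0) -> bool:
    """True if the model plus an ASSUMED headroom fits on a card of the given size."""
    return required_gb + headroom_gb <= card_gb


print(f"8-bit saves {memory_saving(BF16_GB, INT8_GB):.1%} of GPU memory")
for card_gb in (16, 24):  # hypothetical consumer card sizes
    print(
        f"{card_gb}GB card: bf16 fits={fits(BF16_GB, card_gb)}, "
        f"8-bit fits={fits(INT8_GB, card_gb)}"
    )
```

With these numbers the 8-bit model uses a little more than half the memory of the bf16 model, which is what moves it into the range of a single consumer card.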

Maintenance & Community

The project is developed by the Kunlun Group · Skywork team. Integration with Huawei's MindFormers suite on Ascend hardware is available.

Licensing & Compatibility

The models are released under the "Skywork Community License" and may be used commercially, provided the license terms are followed. The license prohibits using the models for activities that threaten national or social security or that are otherwise unlawful.

Limitations & Caveats

The SkyPile-150B dataset, while filtered, may still contain sensitive information. The project disclaims responsibility for risks arising from model misuse or unforeseen issues. Some model variants (Chat, MM) are listed as "coming soon" on certain platforms.

Health Check

Last Commit: 10 months ago

Responsiveness: Inactive

Star History

8 stars in the last 30 days