semikong by aitomatic

Semiconductor LLM for domain-specific tasks

Created 1 year ago
360 stars

Top 77.7% on SourcePulse

View on GitHub
Project Summary

SEMIKONG is an open-source, industry-specific large language model (LLM) for the semiconductor manufacturing domain. It addresses the need for specialized AI in this complex field by providing models trained on a comprehensive corpus of semiconductor-related text, giving them a stronger grasp of the underlying physics, chemistry, and fabrication processes. The project targets engineers, researchers, and companies in the semiconductor industry, offering a foundation for building proprietary AI solutions and improving productivity.

How It Works

SEMIKONG is a Transformer-based model built on the Llama architecture, so it integrates seamlessly with the existing Llama ecosystem. It uses a novel pre-training approach that incorporates domain-specific knowledge, achieving superior performance on industry-relevant benchmarks compared to general-purpose LLMs. The project offers both 8B and 70B parameter instruct models, with weights available on Hugging Face.
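Because the checkpoints follow the Llama format, they load with the standard Hugging Face classes and need no custom modeling code. Below is a minimal loading sketch; the model ID is an assumed placeholder, so consult the project's Hugging Face page for the actual repository names.

    # Minimal sketch: loading SEMIKONG via the standard Llama classes.
    # The model ID is a hypothetical placeholder, not confirmed by this summary.
    from transformers import AutoTokenizer, LlamaForCausalLM

    model_id = "aitomatic/SEMIKONG-8B-Instruct"  # hypothetical Hugging Face ID
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = LlamaForCausalLM.from_pretrained(
        model_id,
        torch_dtype="auto",   # keep the checkpoint's native precision
        device_map="auto",    # shard layers across available GPUs
    )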

Quick Start & Requirements

  • Installation: Clone the repository (git clone https://github.com/aitomatic/semikong.git), navigate into the directory (cd semikong), and install dependencies (pip install -r requirements.txt); an inference sketch follows this list.
  • Prerequisites: Python 3.10+, CUDA. For 4-bit quantized models, AWQ is required; for 8-bit, GPTQ is required.
  • Hardware: SEMIKONG-8B-Instruct requires a minimum of 16 GB VRAM. SEMIKONG-70B-Instruct requires a minimum of 170 GB VRAM (e.g., 3 x A100 80GB). Fine-tuning the 70B model requires significant CPU memory (900 GB+) and multiple high-VRAM GPUs.
  • Resources: Hugging Face Models, SemiKong Paper, Web Demo
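Once dependencies are installed, inference follows the usual transformers workflow. The sketch below is a hedged end-to-end example: the model ID and prompt are illustrative assumptions, and half precision is chosen to fit the ~16 GB VRAM guidance for the 8B model above.

    # Hedged end-to-end inference sketch; the model ID is a placeholder.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "aitomatic/SEMIKONG-8B-Instruct"  # hypothetical Hugging Face ID
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        torch_dtype=torch.float16,  # half precision to fit ~16 GB of VRAM
        device_map="auto",
    )

    # Instruct models expect chat-formatted input; use the bundled template.
    messages = [{"role": "user", "content": "Explain etch selectivity in plasma etching."}]
    input_ids = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)

    output = model.generate(input_ids, max_new_tokens=256)
    print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))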

Highlighted Details

  • First open-source, industry-specific LLM for semiconductor manufacturing.
  • SEMIKONG-70B-Chat ranks first among open-source models on benchmarks such as MMLU, CMMLU, BBH, and GSM8K.
  • Models are compatible with Llama ecosystem tools (e.g., LlamaForCausalLM, LlamaTokenizer).
  • Offers fine-tuning scripts and deployment guidance for various quantization methods (AWQ, GPTQ); see the quantized-loading sketch after this list.
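For deployment on smaller GPUs, quantized checkpoints load through the same transformers interface, which dispatches to AWQ or GPTQ kernels when the autoawq or auto-gptq packages are installed. The repository name below is hypothetical; this summary does not confirm that such a checkpoint is published.

    # Hedged sketch: loading a 4-bit AWQ-quantized checkpoint via transformers.
    # The repository name is hypothetical. transformers reads the quantization
    # config stored with the weights and uses AWQ kernels if autoawq is installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    quant_id = "aitomatic/SEMIKONG-8B-Instruct-AWQ"  # hypothetical repo name
    tokenizer = AutoTokenizer.from_pretrained(quant_id)
    model = AutoModelForCausalLM.from_pretrained(quant_id, device_map="auto")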

Maintenance & Community

The project is a collaborative effort involving Tokyo Electron, FPT Software AIC, and Aitomatic, with contributions from AI Alliance members. Questions and discussions are handled through GitHub.

Licensing & Compatibility

The code and weights are distributed under the Apache 2.0 License, permitting personal, academic, and commercial use. Derivative works require attribution.

Limitations & Caveats

While efforts have been made to ensure data compliance, the model may still produce incorrect or otherwise problematic outputs, owing to the complexity of the training data and the variety of usage scenarios. The project disclaims responsibility for risks arising from misuse.

Health Check

  • Last Commit: 4 months ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 10 stars in the last 30 days

Explore Similar Projects

dbrx by databricks

  Large language model for research/commercial use
  3k stars · Created 1 year ago · Updated 1 year ago
  Starred by Junyang Lin (Core Maintainer at Alibaba Qwen), Hanlin Tang (CTO Neural Networks at Databricks; Cofounder of MosaicML), and 5 more.

Yi by 01-ai

  Open-source bilingual LLMs trained from scratch
  8k stars · Created 1 year ago · Updated 9 months ago
  Starred by Chip Huyen (Author of "AI Engineering" and "Designing Machine Learning Systems"), Simon Willison (Coauthor of Django), and 10 more.

alpaca-lora by tloen

  LoRA fine-tuning for LLaMA
  19k stars · Created 2 years ago · Updated 1 year ago
  Starred by Junyang Lin (Core Maintainer at Alibaba Qwen), Vincent Weisser (Cofounder of Prime Intellect), and 25 more.