self-llm by datawhalechina

LLM guide for Chinese users on Linux

Created 1 year ago · 24,409 stars · Top 1.6% on SourcePulse

Project Summary

This repository provides a comprehensive tutorial for Chinese beginners on deploying and fine-tuning open-source Large Language Models (LLMs) and Multimodal Large Language Models (MLLMs) within a Linux environment. It aims to simplify the process of using and applying these models, making them more accessible to students and researchers.

How It Works

The tutorial covers the entire lifecycle of working with open-source LLMs, from initial environment configuration tailored to each model's requirements, to deploying and using popular models such as LLaMA, ChatGLM, and InternLM. It also details a range of fine-tuning techniques, from full-parameter fine-tuning to parameter-efficient methods such as LoRA and P-Tuning, enabling users to adapt models to their specific needs.
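
As a rough sketch of the parameter-efficient route described above, the example below attaches LoRA adapters to a causal language model with Hugging Face's peft library; the model name, target modules, and hyperparameters are illustrative placeholders rather than settings taken from the repository.

```python
# Minimal LoRA fine-tuning sketch, assuming transformers, peft, and accelerate
# are installed. The model name, target modules, and hyperparameters below are
# illustrative placeholders, not settings taken from the repository.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, TaskType, get_peft_model

model_name = "Qwen/Qwen2-7B-Instruct"  # placeholder; any causal LM works
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto")

# Attach low-rank adapter matrices; only these small matrices are trained,
# which is what makes LoRA far cheaper than full-parameter fine-tuning.
lora_config = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    r=8,                                   # rank of the update matrices
    lora_alpha=32,                         # scaling factor for the updates
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],   # attention projections to adapt
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # typically well under 1% of all parameters

# From here, a standard transformers Trainer loop on an instruction dataset
# completes the fine-tune; the repository's per-model guides give the details.
```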

Quick Start & Requirements

  • Installation: Primarily involves environment setup and model downloads via Hugging Face, ModelScope, or git-lfs. Specific commands depend on the chosen model and deployment method (e.g., vLLM, FastApi, LMStudio, Ollama); a minimal download sketch follows this list.
  • Prerequisites: Linux environment, Python, potentially CUDA for GPU acceleration, and specific model dependencies.
  • Resources: Requires significant disk space for models and potentially powerful GPUs for efficient fine-tuning and inference.
  • Documentation: Comprehensive guides are available within the repository.
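
For reference, a download step like the one mentioned in the Installation item could look like the following, using the modelscope SDK; the model ID and cache directory are placeholders to be replaced per the relevant model guide.

```python
# Minimal model-download sketch using the modelscope SDK, assuming it is
# installed (pip install modelscope). The model ID and cache directory are
# placeholders; the per-model guides give the exact values to use.
from modelscope import snapshot_download

model_dir = snapshot_download(
    "Qwen/Qwen2-7B-Instruct",      # placeholder model ID
    cache_dir="/root/autodl-tmp",  # placeholder path with enough free disk space
    revision="master",
)
print(f"Model weights downloaded to: {model_dir}")
```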

Highlighted Details

  • Supports a wide array of popular LLMs including Qwen, Kimi, Llama, Gemma, DeepSeek, and more.
  • Provides detailed tutorials for various deployment methods such as FastApi, vLLM, and web demos; a minimal vLLM sketch follows this list.
  • Includes practical examples and case studies, such as creating a "Zhen Huan" chatbot or a math-focused LLM.
  • Offers guidance on integrating LLMs with frameworks like LangChain.
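
As an illustration of the vLLM route referenced above, a minimal offline-inference sketch might look like this; the model name and sampling parameters are placeholders rather than the repository's exact configuration.

```python
# Minimal vLLM offline-inference sketch, assuming vllm is installed and a GPU
# is available. The model name and sampling settings are placeholders rather
# than the repository's exact configuration.
from vllm import LLM, SamplingParams

llm = LLM(model="Qwen/Qwen2-7B-Instruct")  # placeholder model name or local path
sampling = SamplingParams(temperature=0.7, top_p=0.9, max_tokens=256)

outputs = llm.generate(
    ["Give a one-sentence introduction to large language models."],
    sampling,
)
for output in outputs:
    print(output.outputs[0].text)
```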

Maintenance & Community

The project is actively maintained by Datawhale members and contributors, with a clear structure for issues and pull requests. Contact information is provided for deeper involvement.

Licensing & Compatibility

The repository itself appears to be open-source, but the licensing of the individual models covered varies. Users should verify the license of each model they intend to use, especially for commercial applications.

Limitations & Caveats

The tutorial is primarily focused on Linux environments, and setup on other operating systems might require adaptation. While it covers many models, the rapid pace of LLM development means new models may not be immediately included.

Health Check

  • Last Commit: 2 weeks ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 6
  • Issues (30d): 5

Star History

1,143 stars in the last 30 days

Explore Similar Projects

Starred by Andrej Karpathy (Founder of Eureka Labs; Formerly at Tesla, OpenAI; Author of CS 231n), Gabriel Almeida (Cofounder of Langflow), and 2 more.

torchchat by pytorch

Top 0.1% · 4k stars
PyTorch-native SDK for local LLM inference across diverse platforms
Created 1 year ago · Updated 1 week ago
Starred by Casper Hansen (Author of AutoAWQ), Yineng Zhang (Inference Lead at SGLang; Research Scientist at Together AI), and 5 more.

xtuner by InternLM

Top 0.5% · 5k stars
LLM fine-tuning toolkit for research
Created 2 years ago · Updated 1 day ago
Starred by Andrej Karpathy (Founder of Eureka Labs; Formerly at Tesla, OpenAI; Author of CS 231n), Stefan van der Walt (Core Contributor to scientific Python ecosystem), and 12 more.

litgpt by Lightning-AI

Top 0.1% · 13k stars
LLM SDK for pretraining, finetuning, and deploying 20+ high-performance LLMs
Created 2 years ago · Updated 6 days ago