LLMsNineStoryDemonTower by km1994

An LLM guide with practical examples in NLP, information retrieval (IR), multimodal tasks, and more

created 2 years ago
2,093 stars

Top 21.9% on sourcepulse

Project Summary

This repository serves as a comprehensive guide and practical resource for Large Language Models (LLMs) across various domains, targeting developers, researchers, and enthusiasts interested in NLP, multimodal AI, and efficient LLM deployment. It aims to demystify LLM applications through a structured "nine-story demon tower" approach, offering hands-on experience and insights into cutting-edge models and techniques.

How It Works

The project is organized into thematic "layers" and "floors," each dedicated to a specific LLM application area or model family. It covers foundational NLP tasks, parameter-efficient fine-tuning (PEFT) methods like LoRA and QLoRA, and practical applications such as text-to-image generation (Stable Diffusion), visual question answering (VQA), automatic speech recognition (Whisper), and text-to-speech. The structure facilitates a systematic exploration of LLMs, from core concepts to advanced implementations.
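
As a flavor of the PEFT layers, the sketch below shows a minimal LoRA setup using Hugging Face's transformers and peft libraries. The model name, rank, and target module names are illustrative assumptions, not values taken from the repository.

```python
# Minimal LoRA fine-tuning setup with Hugging Face peft (illustrative sketch;
# the model name and hyperparameters below are assumptions, not values
# taken from this repository).
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model, TaskType

model_name = "THUDM/chatglm2-6b"  # any causal LM covered by the guide

tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_name, trust_remote_code=True)

# LoRA injects small trainable rank-decomposition matrices into selected
# projection layers while the base weights stay frozen.
lora_config = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    r=8,                      # rank of the update matrices
    lora_alpha=32,            # scaling factor for the LoRA updates
    lora_dropout=0.05,
    target_modules=["query_key_value"],  # module names vary by architecture
)

model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # typically well under 1% of total parameters
```

The resulting model can be passed to a standard transformers Trainer; only the LoRA adapter weights are updated, which is what keeps the VRAM footprint small enough for single-GPU fine-tuning.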

Quick Start & Requirements

  • Installation: Primarily relies on Python and Hugging Face libraries. Specific model implementations may require PyTorch, Transformers, and other dependencies detailed within individual sections.
  • Prerequisites: Access to GPUs is often necessary for fine-tuning and inference, with requirements (e.g., VRAM) varying by model size and technique; CUDA is referenced throughout for GPU acceleration (see the environment check after this list).
  • Resources: Setup complexity and resource demands vary significantly based on the chosen model and task, ranging from single-GPU fine-tuning to distributed training setups.
  • Links: Extensive links to GitHub repositories, Hugging Face model pages, research papers, and demos are provided for each model and technique.
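
Because hardware needs differ per floor of the tower, a quick environment check like the one below (generic PyTorch, not part of the repository itself) can confirm CUDA availability and VRAM before committing to a fine-tuning run.

```python
# Quick environment check before attempting fine-tuning or inference
# (generic PyTorch; not taken from the repository).
import torch

if not torch.cuda.is_available():
    raise SystemExit("No CUDA GPU detected; most fine-tuning sections need one.")

for i in range(torch.cuda.device_count()):
    props = torch.cuda.get_device_properties(i)
    total_gb = props.total_memory / 1024**3
    print(f"GPU {i}: {props.name}, {total_gb:.1f} GB VRAM")
```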

Highlighted Details

  • Covers a wide array of popular LLMs including ChatGLM, LLaMA derivatives, Baichuan, Vicuna, and more.
  • Detailed tutorials on Parameter-Efficient Fine-Tuning (PEFT) techniques like LoRA, QLoRA, and P-Tuning.
  • Explores multimodal capabilities with models like BLIP-2, MiniGPT-4, and VisualGLM for VQA tasks.
  • Includes sections on inference acceleration (vLLM, FasterTransformer) and domain-specific applications (finance, medical, legal, coding); a vLLM sketch follows this list.
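
To illustrate the inference-acceleration material, offline batch generation with vLLM looks roughly like the sketch below. The model name is a placeholder, and vLLM supports only a subset of the architectures covered in the guide.

```python
# Minimal offline batch inference with vLLM (illustrative sketch;
# the model name is a placeholder, not taken from the repository).
from vllm import LLM, SamplingParams

llm = LLM(model="baichuan-inc/Baichuan2-7B-Chat", trust_remote_code=True)

sampling = SamplingParams(temperature=0.8, top_p=0.95, max_tokens=128)
outputs = llm.generate(["Explain LoRA in one sentence."], sampling)

for out in outputs:
    print(out.outputs[0].text)
```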

Maintenance & Community

The repository aggregates information from various active open-source projects and research efforts. Community engagement is encouraged through the QQ groups listed for discussion and support.

Licensing & Compatibility

Licenses vary by the individual models and projects referenced. Many models, such as ChatGLM2/3 and Baichuan, are open for academic research and permit free commercial use upon application for permission. LLaMA derivatives inherit Meta's licensing terms.

Limitations & Caveats

The repository is a curated collection of resources rather than a single cohesive codebase. Users must navigate individual project requirements and potential compatibility issues between different models and libraries. Some models may have specific hardware or data prerequisites not universally detailed.

Health Check

  • Last commit: 1 year ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 71 stars in the last 90 days
