LLMsNineStoryDemonTower by km1994

An LLM guide with practical examples in NLP, information retrieval (IR), multimodal tasks, and more

created 2 years ago
2,093 stars

Top 21.9% on sourcepulse

Project Summary

This repository serves as a comprehensive guide and practical resource for Large Language Models (LLMs) across various domains, targeting developers, researchers, and enthusiasts interested in NLP, multimodal AI, and efficient LLM deployment. It aims to demystify LLM applications through a structured "nine-story demon tower" approach, offering hands-on experience and insights into cutting-edge models and techniques.

How It Works

The project is organized into thematic "layers" and "floors," each dedicated to a specific LLM application area or model family. It covers foundational NLP tasks, parameter-efficient fine-tuning (PEFT) methods like LoRA and QLoRA, and practical applications such as text-to-image generation (Stable Diffusion), visual question answering (VQA), automatic speech recognition (Whisper), and text-to-speech. The structure facilitates a systematic exploration of LLMs, from core concepts to advanced implementations.
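
As a flavor of the PEFT layers, the sketch below shows a minimal LoRA setup using Hugging Face's transformers and peft libraries. The model name, rank, and target module names are illustrative assumptions, not values taken from the repository.

```python
# Minimal LoRA fine-tuning setup with Hugging Face peft (illustrative sketch;
# the model name and hyperparameters below are assumptions, not values
# taken from this repository).
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model, TaskType

model_name = "THUDM/chatglm2-6b"  # any causal LM covered by the guide

tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_name, trust_remote_code=True)

# LoRA injects small trainable rank-decomposition matrices into selected
# projection layers while the base weights stay frozen.
lora_config = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    r=8,                      # rank of the update matrices
    lora_alpha=32,            # scaling factor for the LoRA updates
    lora_dropout=0.05,
    target_modules=["query_key_value"],  # module names vary by architecture
)

model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # typically well under 1% of total parameters
```

The resulting model can be passed to a standard transformers Trainer; only the LoRA adapter weights are updated, which is what keeps the VRAM footprint small enough for single-GPU fine-tuning.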

Quick Start & Requirements

  • Installation: Primarily relies on Python and Hugging Face libraries. Specific model implementations may require PyTorch, Transformers, and other dependencies detailed within individual sections.
  • Prerequisites: Access to GPUs is often necessary for fine-tuning and inference, with requirements (e.g., VRAM) varying by model size and technique; CUDA is referenced throughout for GPU acceleration (see the environment check after this list).
  • Resources: Setup complexity and resource demands vary significantly based on the chosen model and task, ranging from single-GPU fine-tuning to distributed training setups.
  • Links: Extensive links to GitHub repositories, Hugging Face model pages, research papers, and demos are provided for each model and technique.
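
Because hardware needs differ per floor of the tower, a quick environment check like the one below (generic PyTorch, not part of the repository itself) can confirm CUDA availability and VRAM before committing to a fine-tuning run.

```python
# Quick environment check before attempting fine-tuning or inference
# (generic PyTorch; not taken from the repository).
import torch

if not torch.cuda.is_available():
    raise SystemExit("No CUDA GPU detected; most fine-tuning sections need one.")

for i in range(torch.cuda.device_count()):
    props = torch.cuda.get_device_properties(i)
    total_gb = props.total_memory / 1024**3
    print(f"GPU {i}: {props.name}, {total_gb:.1f} GB VRAM")
```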

Highlighted Details

  • Covers a wide array of popular LLMs including ChatGLM, LLaMA derivatives, Baichuan, Vicuna, and more.
  • Detailed tutorials on Parameter-Efficient Fine-Tuning (PEFT) techniques like LoRA, QLoRA, and P-Tuning.
  • Explores multimodal capabilities with models like BLIP-2, MiniGPT-4, and VisualGLM for VQA tasks.
  • Includes sections on inference acceleration (vLLM, FasterTransformer) and domain-specific applications (finance, medical, legal, coding); a vLLM sketch follows this list.
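
To illustrate the inference-acceleration material, offline batch generation with vLLM looks roughly like the sketch below. The model name is a placeholder, and vLLM supports only a subset of the architectures covered in the guide.

```python
# Minimal offline batch inference with vLLM (illustrative sketch;
# the model name is a placeholder, not taken from the repository).
from vllm import LLM, SamplingParams

llm = LLM(model="baichuan-inc/Baichuan2-7B-Chat", trust_remote_code=True)

sampling = SamplingParams(temperature=0.8, top_p=0.95, max_tokens=128)
outputs = llm.generate(["Explain LoRA in one sentence."], sampling)

for out in outputs:
    print(out.outputs[0].text)
```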

Maintenance & Community

The repository aggregates information from various active open-source projects and research efforts. Community engagement is encouraged through the QQ groups listed for discussion and support.

Licensing & Compatibility

Licenses vary by the individual models and projects referenced. Many models, such as ChatGLM2/3 and Baichuan, are open for academic research and permit free commercial use upon application for permission. LLaMA derivatives inherit Meta's licensing terms.

Limitations & Caveats

The repository is a curated collection of resources rather than a single cohesive codebase. Users must navigate individual project requirements and potential compatibility issues between different models and libraries. Some models may have specific hardware or data prerequisites not universally detailed.

Health Check

  • Last commit: 1 year ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 71 stars in the last 90 days
